TITLE OF THE INVENTION
HEPATITIS C VIRUS VACCINE

RELATED APPLICATIONS
The present application claims priority to provisional applications U.S.
Serial No. 60/363,774, filed March 13, 2002, and U.S. Serial No. 60/328,655, filed
October 11, 2001, each of which is hereby incorporated by reference herein.

BACKGROUND OF THE INVENTION 

The references cited in the present application are not admitted to be
prior art to the claimed invention.

About 3% of the world's population are infected with the Hepatitis C
virus (HCV). (Wasley et al., Semin. Liver Dis. 20, 1-16, 2000.) Exposure to HCV
results in an overt acute disease in a small percentage of cases, while in most
instances the virus establishes a chronic infection that causes liver inflammation
and slowly progresses to liver failure and cirrhosis. (Iwarson, FEMS Microbiol.
Rev. 14, 201-204, 1994.) In addition, epidemiological surveys indicate an important
role of HCV in the pathogenesis of hepatocellular carcinoma. (Kew, FEMS Microbiol.
Rev. 14, 211-220, 1994, Alter, Blood 85, 1681-1695, 1995.)

Prior to the implementation of routine blood screening for HCV in
1992, most infections were contracted by inadvertent exposure to contaminated blood,
blood products, or transplanted organs. In those areas where blood screening for HCV
is carried out, HCV is primarily contracted through direct percutaneous exposure to
infected blood, i.e., intravenous drug use. Less frequent methods of transmission
include perinatal exposure, hemodialysis, and sexual contact with an HCV infected
person. (Alter et al., N. Engl. J. Med. 341(8), 556-562, 1999, Alter, J. Hepatol. 31
Suppl., 88-91, 1999, Semin. Liver Dis. 20(1), 1-16, 2000.)

The HCV genome consists of a single strand RNA of about 9.5 kb
encoding a precursor polyprotein of about 3000 amino acids. (Choo et al., Science
244, 362-364, 1989, Choo et al., Science 244, 359-362, 1989, Takamizawa et al., J.
Virol. 65, 1105-1113, 1991.) The HCV polyprotein contains the viral proteins in the
order: C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B.

Individual viral proteins are produced by proteolysis of the HCV
polyprotein. Host cell proteases release the putative structural proteins C, E1, E2, and
p7, and create the N-terminus of NS2 at amino acid 810. (Mizushima et al., J. Virol.
68, 2731-2734, 1994, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.)

The non-structural proteins NS3, NS4A, NS4B, NS5A and NS5B
presumably form the virus replication machinery and are released from the
polyprotein. A zinc-dependent protease associated with NS2 and the N-terminus of
NS3 is responsible for cleavage between NS2 and NS3. (Grakoui et al., J. Virol. 67,
1385-1395, 1993, Hijikata et al., P.N.A.S. USA 90, 10773-10777, 1993.) A distinct
serine protease located in the N-terminal domain of NS3 is responsible for proteolytic
cleavages at the NS3/NS4A, NS4A/NS4B, NS4B/NS5A and NS5A/NS5B junctions.
(Bartenschlager et al., J. Virol. 67, 3835-3844, 1993, Grakoui et al., Proc. Natl. Acad.
Sci. USA 90, 10583-10587, 1993, Tomei et al., J. Virol. 67, 4017-4026, 1993.)
NS4A provides a cofactor for NS3 activity. (Failla et al., J. Virol. 68, 3753-3760,
1994, De Francesco et al., U.S. Patent No. 5,739,002.)

NS5A is a highly phosphorylated protein conferring interferon
resistance. (De Francesco et al., Semin. Liver Dis., 20(1), 69-83, 2000, Pawlotsky,
Viral Hepat. Suppl. 1, 47-48, 1999.)

NS5B provides an RNA-dependent RNA polymerase. (De Francesco
et al., International Publication Number WO 96/37619, Behrens et al., EMBO 15,
12-22, 1996, Lohmann et al., Virology 249, 108-118, 1998.)


SUMMARY OF THE INVENTION 

The present invention features Ad6 vectors and a nucleic acid encoding
a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide containing an inactive NS5B
RNA-dependent RNA polymerase region. The nucleic acid is particularly useful as a
component of an adenovector or DNA plasmid vaccine providing a broad range of
antigens for generating an HCV specific cell mediated immune (CMI) response
against HCV.

An HCV specific CMI response refers to the production of cytotoxic T
lymphocytes and T helper cells that recognize an HCV antigen. The CMI response
may also include non-HCV specific immune effects.

Preferred nucleic acids encode a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide that is substantially similar to SEQ. ID. NO. 1 and has sufficient protease
activity to process itself to produce at least a polypeptide substantially similar to the
NS5B region present in SEQ. ID. NO. 1. The produced polypeptide corresponding to
NS5B is enzymatically inactive. More preferably, the HCV polypeptide has sufficient
protease activity to produce polypeptides substantially similar to the NS3, NS4A,
NS4B, NS5A, and NS5B regions present in SEQ. ID. NO. 1.

Reference to a "substantially similar sequence" indicates an identity of at
least about 65% to a reference sequence. Thus, for example, polypeptides having an amino
acid sequence substantially similar to SEQ. ID. NO. 1 have an overall amino acid identity
of at least about 65% to SEQ. ID. NO. 1.

Polypeptides corresponding to NS3, NS4A, NS4B, NS5A, and NS5B
have an amino acid sequence identity of at least about 65% to the corresponding
region in SEQ. ID. NO. 1. Such corresponding polypeptides are also referred to
herein as NS3, NS4A, NS4B, NS5A, and NS5B polypeptides.

Thus, a first aspect of the present invention describes a nucleic acid
comprising a nucleotide sequence encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide substantially similar to SEQ. ID. NO. 1. The encoded polypeptide has
sufficient protease activity to process itself to produce an NS5B polypeptide that is
enzymatically inactive.

In a preferred embodiment, the nucleic acid is an expression vector
capable of expressing the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide in a
desired human cell. Expression inside a human cell has therapeutic applications for
actively treating an HCV infection and for prophylactically treating against an HCV
infection.

An expression vector contains a nucleotide sequence encoding a
polypeptide along with regulatory elements for proper transcription and processing.
The regulatory elements that may be present include those naturally associated with
the nucleotide sequence encoding the polypeptide and exogenous regulatory elements
not naturally associated with the nucleotide sequence. Exogenous regulatory elements
such as an exogenous promoter can be useful for expression in a particular host, such
as in a human cell. Examples of regulatory elements useful for functional expression
include a promoter, a terminator, a ribosome binding site, and a polyadenylation
signal.

Another aspect of the present invention describes a nucleic acid
comprising a gene expression cassette able to express in a human cell a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1.
The polypeptide can process itself to produce an enzymatically inactive NS5B protein.
The gene expression cassette contains at least the following:

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding a polypeptide;

b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence,

c) a terminator joined to the 3' end of the nucleotide sequence, and

d) a 3' polyadenylation signal functionally coupled to the nucleotide
sequence.

Reference to "transcriptionally coupled" indicates that the promoter is
positioned such that transcription of the nucleotide sequence can be brought about by
RNA polymerase binding at the promoter. Transcriptionally coupled does not require
that the sequence being transcribed is adjacent to the promoter.

Reference to "functionally coupled" indicates the ability to mediate an
effect on the nucleotide sequence. Functionally coupled does not require that the
coupled sequences be adjacent to each other. A 3' polyadenylation signal functionally
coupled to the nucleotide sequence facilitates cleavage and polyadenylation of the
transcribed RNA. A 5' ribosome binding site functionally coupled to the nucleotide
sequence facilitates ribosome binding.

In preferred embodiments the nucleic acid is a DNA plasmid vector or
an adenovector suitable for either therapeutic application in treating HCV or as an
intermediate in the production of a therapeutic vector. Treating HCV includes
actively treating an HCV infection and prophylactically treating against an HCV
infection.

Another aspect of the present invention describes an adenovector
comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette able to express
a polypeptide substantially similar to SEQ. ID. NO. 1 that is produced by a process
involving (a) homologous recombination and (b) adenovector rescue. The
homologous recombination step produces an adenovirus genome plasmid. The
adenovector rescue step produces the adenovector from the adenovirus genome plasmid.

Adenovirus genome plasmids described herein contain a recombinant
adenovirus genome having a deletion in the E1 region and optionally in the E3 region
and a gene expression cassette inserted into one of the deleted regions. The
recombinant adenovirus genome is made of regions substantially similar to one or
more adenovirus serotypes.

Another aspect of the present invention describes an adenovector
consisting of the nucleic acid sequence of SEQ. ID. NO. 4 or a derivative thereof,
wherein said derivative thereof has the HCV polyprotein encoding sequence present
in SEQ. ID. NO. 4 replaced with the HCV polyprotein encoding sequence of either
SEQ. ID. NO. 3, SEQ. ID. NO. 10 or SEQ. ID. NO. 11.

Another aspect of the present invention describes a cultured
recombinant cell comprising a nucleic acid containing a sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1.
The recombinant cell has a variety of uses such as being used to replicate nucleic acid
encoding the polypeptide in vector construction methods.

Another aspect of the present invention describes a method of making
an adenovector comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette
able to express a polypeptide substantially similar to SEQ. ID. NO. 1. The method
involves the steps of (a) producing an adenovirus genome plasmid containing a
recombinant adenovirus genome with deletions in the E1 and E3 regions and a gene
expression cassette inserted into one of the deleted regions and (b) rescuing the
adenovector from the adenovirus genome plasmid.

Another aspect of the present invention describes a pharmaceutical
composition comprising a vector for expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide substantially similar to SEQ. ID. NO. 1 and a pharmaceutically
acceptable carrier. The vector is suitable for administration and polypeptide
expression in a patient.

A "patient" refers to a mammal capable of being infected with HCV.
A patient may or may not be infected with HCV. Examples of patients are humans
and chimpanzees.

Another aspect of the present invention describes a method of treating
a patient comprising the step of administering to the patient an effective amount of a
vector expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially
similar to SEQ. ID. NO. 1. The vector is suitable for administration and polypeptide
expression in the patient.

The patient undergoing treatment may or may not be infected with
HCV. For a patient infected with HCV, an effective amount is sufficient to achieve
one or more of the following effects: reduce the ability of HCV to replicate, reduce
HCV load, increase viral clearance, and increase one or more HCV specific CMI
responses. For a patient not infected with HCV, an effective amount is sufficient to
achieve one or more of the following: an increased ability to produce one or more
components of a HCV specific CMI response to a HCV infection, a reduced
susceptibility to HCV infection, and a reduced ability of the infecting virus to
establish persistent infection for chronic disease.

Another aspect of the present invention features a recombinant nucleic
acid comprising an Ad6 region and a region not present in Ad6. Reference to
"recombinant" nucleic acid indicates the presence of two or more nucleic acid regions
not naturally associated with each other. Preferably, the Ad6 recombinant nucleic
acid contains Ad6 regions and a gene expression cassette coding for a polypeptide
heterologous to Ad6.

Other features and advantages of the present invention are apparent
from the additional descriptions provided herein including the different examples.
The provided examples illustrate different components and methodology useful in
practicing the present invention. The examples do not limit the claimed invention.
Based on the present disclosure the skilled artisan can identify and employ other
components and methodology useful for practicing the present invention.


BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B illustrate SEQ. ID. NO. 1.

Figures 2A, 2B, 2C, and 2D illustrate SEQ. ID. NO. 2. SEQ. ID. NO.
2 provides a nucleotide sequence coding for SEQ. ID. NO. 1 along with an optimized
internal ribosome entry site and TAAA termination. Nucleotides 1-6 provide an
optimized internal ribosome entry site. Nucleotides 7-5961 code for a HCV
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with nucleotides in positions 5137 to
5145 providing an AlaAlaGly sequence in amino acid positions 1711 to 1713 that
renders NS5B inactive. Nucleotides 5962-5965 provide a TAAA termination.
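
The coordinates given for SEQ. ID. NO. 2 can be cross-checked with simple codon arithmetic. The following Python sketch is illustrative only and is not part of the disclosure; it assumes 1-based nucleotide numbering and a reading frame that begins at nucleotide 7, as stated above. The function name codon_index is hypothetical.

```python
# Codon arithmetic for SEQ. ID. NO. 2 (1-based positions, as used in the text).
CDS_START = 7     # first nucleotide of the initiating Met codon
CDS_END = 5961    # last nucleotide of the final NS5B codon


def codon_index(nucleotide_pos: int) -> int:
    """Return the 1-based amino acid position encoded at a nucleotide position."""
    if not CDS_START <= nucleotide_pos <= CDS_END:
        raise ValueError("position lies outside the coding region")
    return (nucleotide_pos - CDS_START) // 3 + 1


# Nucleotides 5137-5145 cover exactly three codons.
mutated = sorted({codon_index(p) for p in range(5137, 5146)})

# Number of codons in the coding region (stop signal excluded).
total_codons = (CDS_END - CDS_START + 1) // 3

print(mutated, total_codons)
```

Under these assumptions the three mutated codons map to amino acid positions 1711 to 1713, and the coding region spans 1985 codons, which agrees with the NS5B region ending at amino acid 1985.
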
Figures 3A, 3B, 3C, and 3D illustrate SEQ. ID. NO. 3. SEQ. ID. NO.
3 is a codon optimized version of SEQ. ID. NO. 2. Nucleotides 7-5961 encode a
HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

Figures 4A-4M illustrate MRKAd6-NSmut (SEQ. ID. NO. 4). SEQ.
ID. NO. 4 is an adenovector containing an expression cassette where the polypeptide
of SEQ. ID. NO. 1 is encoded by SEQ. ID. NO. 2. Base pairs 1-450 correspond to the
Ad5 bp 1 to 450; base pairs 462 to 1252 correspond to the human CMV promoter;
base pairs 1258 to 1267 correspond to the Kozak sequence; base pairs 1264 to 7222
correspond to the NS genes; base pairs 7231 to 7451 correspond to the BGH
polyadenylation signal; base pairs 7469 to 9506 correspond to Ad5 base pairs 3511 to
5548; base pairs 9507 to 32121 correspond to Ad6 base pairs 5542 to 28156; base
pairs 32122 to 35117 correspond to Ad6 base pairs 30789 to 33784; and base pairs
35118 to 37089 correspond to Ad5 base pairs 33967 to 35935.

Figures 5A-5O illustrate SEQ. ID. NOs. 5 and 6. SEQ. ID. NO. 5
encodes a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with an active
RNA dependent RNA polymerase. SEQ. ID. NO. 6 provides the amino acid sequence
for the polypeptide.

Figures 6A-6C provide the nucleic acid sequence for pV1JnsA (SEQ.
ID. NO. 7).

Figures 7A-7N provide the nucleic acid sequence for the Ad6 genome
(SEQ. ID. NO. 8).

Figures 8A-8K provide the nucleic acid sequence for the Ad5 genome
(SEQ. ID. NO. 9).

Figure 9 illustrates different regions of the Ad6 genome. The linear
(35759 bp) ds DNA genome is indicated by two parallel lines and is divided into 100
map units. Transcription units are shown relative to their position and orientation in
the genome. Early genes (E1A, E1B, E2A/B, E3 and E4) are indicated by gray arrows.
Late genes (L1 to L5), indicated by black arrows, are produced by alternative splicing
of a transcript produced from the major late promoter (MLP) and all contain the
tripartite leader (1, 2, 3) at their 5' ends. The E1 region is located from approximately
1.0 to 11.5 map units, the E2 region from 75.0 to 11.5 map units, E3 from 76.1 to 86.7
map units, and E4 from 99.5 to 91.2 map units. The major late transcription unit is
located between 16.0 and 91.2 map units.
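
Because the 35759 bp genome is divided into 100 map units, a map-unit position can be converted to an approximate base-pair position by simple proportion. The Python sketch below is illustrative only; the function name map_units_to_bp and the rounding to the nearest base pair are assumptions, and regions listed with a larger first value (E2, E4) are kept in the order given in the text.

```python
GENOME_LENGTH_BP = 35759  # Ad6 genome length stated for Figure 9


def map_units_to_bp(map_units: float, genome_length: int = GENOME_LENGTH_BP) -> int:
    """Convert a map-unit position (0 to 100) to an approximate base-pair position."""
    if not 0.0 <= map_units <= 100.0:
        raise ValueError("map units run from 0 to 100")
    return round(map_units * genome_length / 100)


# Region boundaries in map units, in the order given for Figure 9.
regions = {
    "E1": (1.0, 11.5),
    "E2": (75.0, 11.5),
    "E3": (76.1, 86.7),
    "E4": (99.5, 91.2),
}

for name, (first, second) in regions.items():
    print(name, map_units_to_bp(first), map_units_to_bp(second))
```

One map unit corresponds to about 357.6 bp, so the converted positions are approximations and not exact feature coordinates.
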

Figure 10 illustrates homologous recombination to recover pAdE1-E3+
containing Ad6 and Ad5 regions.

Figure 11 illustrates homologous recombination to recover a pAdE1-E3+
containing Ad6 regions.

Figure 12 illustrates a western blot on whole-cell extracts from 293
cells transfected with plasmid DNA expressing different HCV NS cassettes. Mature
NS3 and NS5A products were detected with specific antibodies. "pV1Jns-NS" refers
to a pV1JnsA plasmid where a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is
encoded by SEQ. ID. NO. 5, and SEQ. ID. NO. 5 is inserted between bases 1881 and
1912 of SEQ. ID. NO. 7. "pV1Jns-NSmut" refers to a pV1JnsA plasmid where SEQ.
ID. NO. 2 is inserted between bases 1882 and 1925 of SEQ. ID. NO. 7.
"pV1Jns-NSOPTmut" refers to a pV1JnsA plasmid where SEQ. ID. NO. 3 is inserted
between bases 1881 and 1905 of SEQ. ID. NO. 7.

Figures 13A and 13B illustrate T cell responses by IFNγ ELIspot
induced in C57black6 mice (A) and BalbC mice (B) by two injections of 25 µg and
50 µg, respectively, of plasmid DNA encoding the different HCV NS cassettes with
Gene Electro-Transfer (GET).
Figure 14 illustrates protein expression from different adenovectors
upon infection of HeLa cells. MRKAd5-NSmut is an adenovector based on an Ad5
sequence (SEQ. ID. NO. 9), where the Ad5 genome has an E1 deletion of base pairs
451 to 3510, an E3 deletion of base pairs 28134 to 30817, and has the
NS3-NS4A-NS4B-NS5A-NS5B expression cassette as provided in base pairs 451 to 7468
of SEQ. ID. NO. 4 inserted between positions 450 and 3511. Ad5-NS is an adenovector
based on an Ad5 backbone with an E1 deletion of base pairs 342 to 3523, and E3
deletion of base pairs 28134 to 30817 and containing an expression cassette encoding a
NS3-NS4A-NS4B-NS5A-NS5B from SEQ. ID. NO. 5. "MRKAd6-NSOPTmut" refers to
an adenovector having a modified SEQ. ID. NO. 4 sequence, wherein base pairs 1258
to 7222 of SEQ. ID. NO. 4 are replaced with SEQ. ID. NO. 3.

Figure 15 illustrates T cell responses by IFNγ ELIspot induced in
C57black6 mice by two injections of 10^9 vp of adenovectors containing different
HCV non-structural gene cassettes.

Figures 16A-16D illustrate T cell responses by IFNγ ELIspot induced
in Rhesus monkeys by one or two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors containing different HCV non-structural gene cassettes.

Figures 17A and 17B illustrate CD8+ T cell responses by IFNγ ICS
induced in Rhesus monkeys by two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors encoding the different HCV non-structural gene cassettes.

Figures 18A-18F illustrate T cell responses by bulk CTL assay induced
in Rhesus monkeys by two injections of 10^11 vp of Ad5-NS (A), MRKAd5-NSmut
(B), or MRKAd6-NSmut (C).

Figure 19 illustrates the plasmid pE2. 

Figures 20A-D illustrate the partial codon optimized sequence
NSsuboptmut (SEQ. ID. NO. 10). Coding sequence for the
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is from base 7 to 5961.

DETAILED DESCRIPTION OF THE INVENTION

The present invention features Ad6 vectors and nucleic acid encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide that contains an inactive NS5B
region. Providing an inactive NS5B region supplies NS5B antigens while reducing the
possibility of adverse side effects due to an active viral RNA polymerase. Uses of the
featured nucleic acid include use as a vaccine component to introduce into a cell an
HCV polypeptide that provides a broad range of antigens for generating a CMI
response against HCV, and as an intermediate for producing such a vaccine
component.

The adaptive cellular immune response can function to recognize viral
antigens in HCV infected cells throughout the body due to the ubiquitous distribution
of major histocompatibility complex (MHC) class I and II expression, to induce
immunological memory, and to maintain immunological memory. These functions
are attributed to antigen-specific CD4+ T helper (Th) and CD8+ cytotoxic T cells
(CTL).

Upon activation via their specific T cell receptors, HCV specific Th
cells fulfill a variety of immunoregulatory functions, most of them mediated by Th1
and Th2 cytokines. HCV specific Th cells assist in the activation and differentiation
of B cells and induction and stimulation of virus-specific cytotoxic T cells. Together
with CTL, Th cells may also secrete IFN-γ and TNF-α that inhibit replication and
gene expression of several viruses. Additionally, Th cells and CTL, the main effector
cells, can induce apoptosis and lysis of virus infected cells.

HCV specific CTL are generated from antigens processed by
professional antigen presenting cells (pAPCs). Antigens can be either synthesized
within or introduced into pAPCs. Antigen synthesis in a pAPC can be brought about
by introducing into the cell an expression cassette encoding the antigen.

A preferred route of nucleic acid vaccine administration is an
intramuscular route. Intramuscular administration appears to result in the introduction
and expression of nucleic acid into somatic cells and pAPCs. HCV antigens produced
in the somatic cells can be transferred to pAPCs for presentation in the context of
MHC class I molecules. (Donnelly et al., Annu. Rev. Immunol. 15:617-648, 1997.)

pAPCs process longer length antigens into smaller peptide antigens in
the proteasome complex. The antigen is translocated into the endoplasmic
reticulum/Golgi complex secretory pathway for association with MHC class I
proteins. CD8+ T lymphocytes recognize antigen associated with class I MHC via the
T cell receptor (TCR) and the CD8 cell surface protein.

WO 03/031588    PCT/US02/32512

Using a nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide as a vaccine component allows for production of a broad range of
antigens capable of generating CMI responses from a single vector. The polypeptide
should be able to process itself sufficiently to produce at least a region corresponding
to NS5B. Preferred nucleic acids encode an amino acid sequence substantially similar
to SEQ. ID. NO. 1 that has sufficient protease activity to process itself to produce
individual HCV polypeptides substantially similar to the NS3, NS4A, NS4B, NS5A,
and NS5B regions present in SEQ. ID. NO. 1.

A polypeptide substantially similar to SEQ. ID. NO. 1 with sufficient
protease activity to process itself in a cell provides the cell with T cell epitopes that
are present in several different HCV strains. Protease activity is provided by NS3 and
NS3/NS4A proteins digesting the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide at
the appropriate cleavage sites to release polypeptides corresponding to NS3, NS4A,
NS4B, NS5A, and NS5B. Self-processing of the Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide generates polypeptides that approximate naturally occurring HCV
polypeptides.

Based on the guidance provided herein a sufficiently strong immune
response can be generated to achieve beneficial effects in a patient. The provided
guidance includes information concerning HCV sequence selection, vector selection,
vector production, combination treatment, and administration.

I. HCV SEQUENCES

A variety of different nucleic acid sequences can be used as a vaccine
component to supply a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide to a
cell or as an intermediate to produce vaccine components. The starting point for
obtaining suitable nucleic acid sequences is preferably a naturally occurring
NS3-NS4A-NS4B-NS5A-NS5B polypeptide sequence modified to produce an inactive
NS5B.

The use of a HCV nucleic acid sequence providing HCV non-structural
antigens to generate a CMI response is mentioned by Cho et al., Vaccine
17:1136-1144, 1999, Paliard et al., International Publication Number WO 01/30812
(not admitted to be prior art to the claimed invention), and Coit et al., International
Publication Number WO 01/38360 (not admitted to be prior art to the claimed
invention). Such references fail to describe, for example, a polypeptide that processes
itself to produce an inactive NS5B, and the particular combinations of HCV
sequences and delivery vehicles employed herein.

Modifications to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide sequence can be produced by altering the encoding nucleic acid.
Alterations can be performed to create deletions, insertions and substitutions.

Small modifications can be made in NS5B to produce an inactive
polymerase by targeting motifs essential for replication. Examples of motifs critical
for NS5B activity and modifications that can be made to produce an inactive NS5B
are described by Lohmann et al., Journal of Virology 71:8416-8426, 1997, and
Kolykhalov et al., Journal of Virology 74:2046-2051, 2000.

Additional factors to take into account when producing modifications
to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide include maintaining the
ability to self-process and maintaining T cell antigens. The ability of the HCV
polypeptide to process itself is determined to a large extent by a functional NS3
protease. Modifications that maintain NS3 protease activity can be obtained
by taking into account the NS3 protein, NS4A which serves as a cofactor for NS3, and
NS3 protease recognition sites present within the NS3-NS4A-NS4B-NS5A-NS5B
polypeptide.

Different modifications can be made to naturally occurring
NS3-NS4A-NS4B-NS5A-NS5B polypeptide sequences to produce polypeptides able to
elicit a broad range of T cell responses. Factors influencing the ability of a
polypeptide to elicit a broad T cell response include the preservation or introduction
of HCV specific T cell antigen regions and prevalence of different T cell antigen
regions in different HCV isolates.

Numerous examples of naturally occurring HCV isolates are well
known in the art. HCV isolates can be classified into the following six major
genotypes comprising one or more subtypes: HCV-1/(1a,1b,1c), HCV-2/(2a,2b,2c),
HCV-3/(3a,3b,10a), HCV-4/(4a), HCV-5/(5a) and HCV-6/(6a,6b,7b,8b,9a,11a).
(Simmonds, J. Gen. Virol., 693-712, 2001.) Examples of particular HCV sequences
such as HCV-BK, HCV-J, HCV-N, HCV-H, have been deposited in GenBank and
described in various publications. (See, for example, Chamberlain et al., J. Gen.
Virol., 1341-1347, 1997.)

HCV T cell antigens can be identified by, for example, empirical
experimentation. One way of identifying T cell antigens involves generating a series
of overlapping short peptides from a longer length polypeptide and then screening the
T-cell populations from infected patients for positive clones. Positive clones are
activated/primed by a particular peptide. Techniques such as IFNγ-ELISPOT,
IFNγ-Intracellular staining and bulk CTL assays can be used to measure peptide
activity. Peptides thus identified can be considered to represent T-cell epitopes of the
respective pathogen.
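
The generation of overlapping short peptides from a longer polypeptide can be sketched in Python as follows. This is a minimal illustration, not a protocol taken from the disclosure: the peptide length of 15 residues, the overlap of 10 residues, and the function name overlapping_peptides are all assumptions chosen for the example.

```python
def overlapping_peptides(sequence: str, length: int = 15, overlap: int = 10) -> list[str]:
    """Split a polypeptide into fixed-length peptides sharing `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    if not sequence:
        return []
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    peptides = [sequence[i:i + length]
                for i in range(0, len(sequence) - length + 1, step)]
    # Add one final peptide when the regular steps do not reach the C-terminus.
    if (len(sequence) - length) % step != 0:
        peptides.append(sequence[-length:])
    return peptides


# Toy 26-residue input (not an HCV sequence), 10-mers overlapping by 5.
demo = overlapping_peptides("ABCDEFGHIJKLMNOPQRSTUVWXYZ", length=10, overlap=5)
print(demo)
```

Each peptide would then be tested individually or in pools in the T cell assays named above; the choice of length and overlap determines how many peptides must be screened.
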

HCV T cell antigen regions from different HCV isolates can be
introduced into a single sequence by, for example, producing a hybrid
NS3-NS4A-NS4B-NS5A-NS5B polypeptide containing regions from two or more naturally
occurring sequences. Such a hybrid can contain additional modifications, which
preferably do not reduce the ability of the polypeptide to produce an HCV CMI
response.

The ability of a modified Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide to process itself and produce a CMI response can be determined using
techniques described herein or well known in the art. Such techniques include the use
of IFNγ-ELISPOT, IFNγ-Intracellular staining and bulk CTL assays to measure a
HCV specific CMI response.

A. Met-NS3-NS4A-NS4B-NS5A-NS5B Sequences 

SEQ. ID. NO. 1 provides a preferred Met-NS3-NS4A-NS4B-NS5A-NS5B
sequence. SEQ. ID. NO. 1 contains a large number of HCV specific T cell
antigens that are present in several different HCV isolates. SEQ. ID. NO. 1 is similar
to the NS3-NS4A-NS4B-NS5A-NS5B portion of the HCV BK strain nucleotide
sequence (GenBank accession number M58335).

In SEQ. ID. NO. 1 anchor positions important for recognition by MHC
class I molecules are conserved or represent conservative substitutions for 18 out of
20 known T-cell epitopes in the NS3-NS4A-NS4B-NS5A-NS5B portion of HCV
polyproteins. With respect to the remaining two known T-cell epitopes, one has a
non-conservative anchor substitution in SEQ. ID. NO. 1 that may still be recognized
by a different HLA supertype and one epitope has one anchor residue not conserved.
HCV T-cell epitopes are described in Chisari et al., Curr. Top. Microbiol. Immunol.,
242:299-325, 2000, and Lechner et al., J. Exp. Med. 191:1499-1512, 2000.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence and SEQ. ID. NO. 1 include the introduction of a methionine at
the 5' end and the presence of modified NS5B active site residues in SEQ. ID. NO. 1.
The modification replaces GlyAspAsp with AlaAlaGly (residues 1711-1713) to
inactivate NS5B.

The encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide
preferably has an amino acid sequence substantially similar to SEQ. ID. NO. 1. In
different embodiments, the encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide has an amino acid identity to SEQ. ID. NO. 1 of at least 65%, at least
75%, at least 85%, at least 95%, at least 99% or 100%; or differs from SEQ. ID. NO.
1 by 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16,
1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid differences between a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide and SEQ. ID. NO. 1 are calculated by determining the minimum
number of amino acid modifications in which the two sequences differ. Amino acid
modifications can be deletions, additions, substitutions or any combination thereof.
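
One way to read "the minimum number of amino acid modifications" is as an edit distance in which each deletion, addition, or substitution counts as one modification. The Python sketch below implements that reading (the Levenshtein distance); the text does not name a specific algorithm, so this choice and the function name amino_acid_differences are assumptions made for illustration.

```python
def amino_acid_differences(a: str, b: str) -> int:
    """Minimum number of single-residue deletions, additions or substitutions
    needed to convert sequence a into sequence b (Levenshtein distance)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, res_a in enumerate(a, start=1):
        current = [i]
        for j, res_b in enumerate(b, start=1):
            cost = 0 if res_a == res_b else 1
            current.append(min(
                previous[j] + 1,         # delete res_a
                current[j - 1] + 1,      # add res_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


# Toy examples using one-letter codes (not taken from SEQ. ID. NO. 1).
print(amino_acid_differences("GDD", "AAG"))
print(amino_acid_differences("MAPITAYSQ", "MAPITAYQ"))
```

Under this reading, replacing the three-residue motif GDD with AAG counts as three modifications, and a single-residue deletion counts as one.
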

Amino acid sequence identity is determined by methods well known in
the art that compare the amino acid sequence of one polypeptide to the amino acid
sequence of a second polypeptide and generate a sequence alignment. Amino acid
identity is calculated from the alignment by counting the number of aligned residue
pairs that have identical amino acids.
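
Counting identical aligned residue pairs can be sketched in Python as shown below. The text specifies the count but not the denominator used to express it as a percentage; dividing by the alignment length is an assumption made here, as are the function names and the use of "-" as the gap symbol.

```python
def identical_pairs(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Count alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical pairs as a percentage of alignment columns (assumed denominator)."""
    if not aligned_a:
        raise ValueError("alignment is empty")
    return 100.0 * identical_pairs(aligned_a, aligned_b, gap) / len(aligned_a)


# Toy alignment with one substitution and one gap (not taken from SEQ. ID. NO. 1).
print(identical_pairs("MAPITAYSQ", "MAPLTAY-Q"))
print(round(percent_identity("MAPITAYSQ", "MAPLTAY-Q"), 1))
```

Other denominators, such as the length of the shorter sequence, give different percentages for the same alignment, so the convention should be stated whenever an identity threshold is applied.
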

Methods for determining sequence identity include those described by
Schuler, G.D. in Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins, Baxevanis, A.D. and Ouelette, B.F.F., eds., John Wiley & Sons, Inc., 2001;
Yona, et al., in Bioinformatics: Sequence, structure and databanks, Higgins, D. and
Taylor, W. eds., Oxford University Press, 2000; and Bioinformatics: Sequence and
Genome Analysis, Mount, D.W., ed., Cold Spring Harbor Laboratory Press, 2001.
Methods to determine amino acid sequence identity are codified in publicly available
computer programs such as GAP (Wisconsin Package Version 10.2, Genetics
Computer Group (GCG), Madison, Wisc.), BLAST (Altschul et al., J. Mol. Biol.
215(3):403-10, 1990), and FASTA (Pearson, Methods in Enzymology 183:63-98,
1990, R.F. Doolittle, ed.).

In an embodiment of the present invention, sequence identity between 
two polypeptides is determined using the GAP program (Wisconsin Package Version 
10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the alignment 
method of Needleman and Wunsch. (Needleman, et al., J. Mol. Biol. 48:443-453, 
1970.) GAP considers all possible alignments and gap positions between two 
sequences and creates a global alignment that maximizes the number of matched 
residues and minimizes the number and size of gaps. A scoring matrix is used to 
assign values for symbol matches. In addition, a gap creation penalty and a gap 
extension penalty are required to limit the insertion of gaps into the alignment. 
Default program parameters for polypeptide comparisons using GAP are the 
BLOSUM62 (Henikoff et al., Proc. Natl. Acad. Sci. USA, 89:10915-10919, 1992) 
amino acid scoring matrix (MATrix=blosum62.cmp), a gap creation parameter 
(GAPweight=8) and a gap extension parameter (LENgthweight=2). 
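
The role of the two gap parameters can be illustrated with a small global alignment scorer. This is a sketch and not the GAP program itself: it substitutes a flat match/mismatch score for the BLOSUM62 matrix, and it assumes that a gap of length L costs `gap_open + gap_extend * L`. The function name and default scores are hypothetical.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=5, mismatch=-4, gap_open=8, gap_extend=2):
    """Best global alignment score with separate gap creation and
    gap extension penalties (three-state dynamic programming)."""
    n, m = len(a), len(b)
    # M: last column pairs two residues; X: residue of a over a gap;
    # Y: residue of b over a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + pair
            # Opening a gap pays creation plus one extension; continuing pays extension only.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_global_score("ACGT", "AGT"))  # three matches minus one gap of length 1
```

Raising the gap creation penalty relative to the extension penalty favors fewer, longer gaps, which is the behavior the two separate parameters are meant to control.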

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptides, in addition to being substantially similar to SEQ. ID. NO. 1 across their 
entire length, produce individual NS3, NS4A, NS4B, NS5A and NS5B regions that are 
substantially similar to the corresponding regions present in SEQ. ID. NO. 1. The 
corresponding regions in SEQ. ID. NO. 1 are provided as follows: Met-NS3 amino 
acids 1-632; NS4A amino acids 633-686; NS4B amino acids 687-947; NS5A amino 
acids 948-1394; and NS5B amino acids 1395-1985. 
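
The region boundaries listed above lend themselves to a simple lookup table. The sketch below records the stated 1-based, inclusive coordinates and shows how a region could be located or sliced out of a full-length sequence; the names `SEQ1_REGIONS`, `region_of`, and `extract_region` are hypothetical helpers introduced for illustration.

```python
# 1-based, inclusive amino acid coordinates as listed for SEQ. ID. NO. 1.
SEQ1_REGIONS = {
    "Met-NS3": (1, 632),
    "NS4A": (633, 686),
    "NS4B": (687, 947),
    "NS5A": (948, 1394),
    "NS5B": (1395, 1985),
}


def region_length(name: str) -> int:
    start, end = SEQ1_REGIONS[name]
    return end - start + 1


def region_of(residue: int) -> str:
    """Return the name of the region containing a 1-based residue number."""
    for name, (start, end) in SEQ1_REGIONS.items():
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside 1-1985")


def extract_region(polypeptide: str, name: str) -> str:
    """Slice one region out of a full-length sequence string."""
    start, end = SEQ1_REGIONS[name]
    return polypeptide[start - 1:end]


if __name__ == "__main__":
    for name in SEQ1_REGIONS:
        print(name, region_length(name))
```

Per-region identity can then be assessed by extracting the same region from both sequences and comparing each pair separately.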

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B 
region has an amino acid identity to the corresponding region in SEQ. ID. NO. 1 of at 
least 65%, at least 75%, at least 85%, at least 95%, at least 99%, or 100%; or an 
amino acid difference of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 
1-14, 1-15, 1-16, 1-17, 1-18, 1-19, or 1-20 amino acids. 

Amino acid modifications to SEQ. ID. NO. 1 preferably maintain all or 
most of the T-cell antigen regions. Differences in naturally occurring amino acids are 
due to different amino acid side chains (R groups). An R group affects different 
properties of the amino acid such as physical size, charge, and hydrophobicity. 
Amino acids can be divided into different groups as follows: neutral and hydrophobic 
(alanine, valine, leucine, isoleucine, proline, tryptophan, phenylalanine, and 
methionine); neutral and polar (glycine, serine, threonine, tyrosine, cysteine, 
asparagine, and glutamine); basic (lysine, arginine, and histidine); and acidic (aspartic 
acid and glutamic acid). 

Generally, in substituting different amino acids it is preferable to 
exchange amino acids having similar properties. Substitutions of different amino acids 
within a particular group, such as valine for leucine, arginine for lysine, 
and asparagine for glutamine, are good candidates for not causing a change in 
polypeptide tertiary structure. 
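
The grouping above can be encoded directly to flag whether a proposed substitution stays within a group. The sketch below uses one-letter amino acid codes and the four groups exactly as listed; the names `GROUPS` and `is_conservative` are hypothetical, and a same-group result indicates only a candidate for preserving structure, not a guarantee.

```python
# Groups as listed in the text, written with one-letter amino acid codes.
GROUPS = {
    "neutral and hydrophobic": "AVLIPWFM",
    "neutral and polar": "GSTYCNQ",
    "basic": "KRH",
    "acidic": "DE",
}

# Reverse lookup: residue -> group name.
GROUP_OF = {aa: name for name, members in GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same group."""
    return GROUP_OF[original] == GROUP_OF[replacement]


if __name__ == "__main__":
    print(is_conservative("L", "V"))  # valine for leucine
    print(is_conservative("K", "R"))  # arginine for lysine
    print(is_conservative("D", "K"))  # acidic to basic
```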

Starting with a particular amino acid sequence and the known 
degeneracy of the genetic code, a large number of different encoding nucleic acid 
sequences can be obtained. The degeneracy of the genetic code arises because almost 
all amino acids are encoded by different combinations of nucleotide triplets or 
"codons". The translation of a particular codon into a particular amino acid is well 
known in the art (see, e.g., Lewin GENES IV, p. 119, Oxford University Press, 1990). 
Amino acids are encoded by codons as follows: 

A=Ala=Alanine: codons GCA, GCC, GCG, GCU 
C=Cys=Cysteine: codons UGC, UGU 
D=Asp=Aspartic acid: codons GAC, GAU 
E=Glu=Glutamic acid: codons GAA, GAG 
F=Phe=Phenylalanine: codons UUC, UUU 
G=Gly=Glycine: codons GGA, GGC, GGG, GGU 
H=His=Histidine: codons CAC, CAU 
I=Ile=Isoleucine: codons AUA, AUC, AUU 
K=Lys=Lysine: codons AAA, AAG 
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU 
M=Met=Methionine: codon AUG 
N=Asn=Asparagine: codons AAC, AAU 
P=Pro=Proline: codons CCA, CCC, CCG, CCU 
Q=Gln=Glutamine: codons CAA, CAG 
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU 
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU 
T=Thr=Threonine: codons ACA, ACC, ACG, ACU 
V=Val=Valine: codons GUA, GUC, GUG, GUU 
W=Trp=Tryptophan: codon UGG 
Y=Tyr=Tyrosine: codons UAC, UAU. 
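
To give a sense of how quickly degeneracy multiplies the number of possible encoding sequences, the sketch below stores the standard codon assignments and counts the distinct RNA sequences that encode a given peptide (the product of the codon counts at each position). The names `CODONS` and `count_encoding_sequences` are hypothetical.

```python
from math import prod

# Standard genetic code, sense codons only, keyed by one-letter amino acid code.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encoding_sequences(peptide: str) -> int:
    """Number of distinct codon sequences that translate to the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


if __name__ == "__main__":
    # Hypothetical four-residue peptide: 1 * 4 * 4 * 3 combinations.
    print(count_encoding_sequences("MAPI"))
```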

Nucleic acid sequences can be optimized in an effort to enhance 
expression in a host. Factors to be considered include C:G content, preferred codons, 
and the avoidance of inhibitory secondary structure. These factors can be combined 
in different ways in an attempt to obtain nucleic acid sequences having enhanced 
expression in a particular host. (See, for example, Donnelly et al., International 
Publication Number WO 97/47358.) 
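
Two of the factors named above, C:G content and preferred codons, are easy to quantify. The sketch below is illustrative only: `gc_fraction` measures the fraction of G and C bases, and `back_translate` builds a coding sequence from a per-residue codon preference table. The table `PREFERRED` is a hypothetical three-entry example, not a codon usage table from the disclosure or from the cited publication.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical preference table covering only the residues used below.
PREFERRED = {"M": "AUG", "A": "GCC", "L": "CUG"}


def back_translate(peptide: str, preferred: dict) -> str:
    """Encode a peptide using one chosen codon per amino acid."""
    return "".join(preferred[aa] for aa in peptide)


if __name__ == "__main__":
    # Two synonymous encodings of Ala-Ala with different C:G content.
    print(gc_fraction("GCUGCA"), gc_fraction("GCCGCG"))
    print(back_translate("MAL", PREFERRED))
```

Because synonymous codons differ in base composition, the choice of preferred codons and the resulting C:G content cannot be tuned independently, which is one reason the text describes combining the factors in different ways.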

The ability of a particular sequence to have enhanced expression in a 
particular host involves some empirical experimentation. Such experimentation 
involves measuring expression of a prospective nucleic acid sequence and, if needed, 
altering the sequence. 

B. Encoding Nucleotide Sequences 

SEQ. ID. NOs. 2 and 3 provide two examples of nucleotide sequences 
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B sequence. The coding sequence of 
SEQ. ID. NO. 2 is similar (99.4% nucleotide sequence identity) to the NS3-NS4A- 
NS4B-NS5A-NS5B region of the naturally occurring HCV-BK sequence (GenBank 
accession number M58335). SEQ. ID. NO. 3 is a codon-optimized version of SEQ. 
ID. NO. 2. SEQ. ID. NOs. 2 and 3 have a nucleotide sequence identity of 78.3%. 

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B 
nucleotide (GenBank accession number M58335) and SEQ. ID. NO. 2, include SEQ. 
ID. NO. 2 having a ribosome binding site, an ATG methionine codon, a region coding 
for a modified NS5B catalytic domain, a TAAA stop signal and an additional 30 
nucleotide differences. The modified catalytic domain codes for AlaAlaGly 
(residues 1711-1713) instead of GlyAspAsp to inactivate NS5B. 

A nucleotide sequence encoding a HCV Met-NS3-NS4A-NS4B- 
NS5A-NS5B polypeptide is preferably substantially similar to the SEQ. ID. NO. 2 
coding region. In different embodiments, the nucleotide sequence encoding a HCV 
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide has a nucleotide sequence identity 
to the SEQ. ID. NO. 2 coding region of at least 65%, at least 75%, at least 85%, at 
least 95%, at least 99%, or 100%; or differs from SEQ. ID. NO. 2 by 1-2, 1-3, 1-4, 
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides. 

Nucleotide differences between a sequence coding Met-NS3-NS4A- 
NS4B-NS5A-NS5B and the SEQ. ID. NO. 2 coding region are calculated by 
determining the minimum number of nucleotide modifications in which the two 
sequences differ. Nucleotide modifications can be deletions, additions, substitutions 
or any combination thereof. 

Nucleotide sequence identity is determined by methods well known in 
the art that compare the nucleotide sequence of one sequence to the nucleotide 
sequence of a second sequence and generate a sequence alignment. Sequence identity 
is determined from the alignment by counting the number of aligned positions having 
identical nucleotides. 

Methods for determining nucleotide sequence identity between two 
polynucleotides include those described by Schuler, in Bioinformatics: A Practical 
Guide to the Analysis of Genes and Proteins, Baxevanis, A.D. and Ouelette, B.F.F., 
eds., John Wiley & Sons, Inc., 2001; Yona et al., in Bioinformatics: Sequence, 
structure and databanks, Higgins, D. and Taylor, W. eds., Oxford University Press, 
2000; and Bioinformatics: Sequence and Genome Analysis, Mount, D.W., ed., Cold 
Spring Harbor Laboratory Press, 2001. Methods to determine nucleotide sequence 
identity are codified in publicly available computer programs such as GAP 
(Wisconsin Package Version 10.2, Genetics Computer Group (GCG), Madison, 
Wisc.), BLAST (Altschul et al., J. Mol. Biol. 215(3):403-10, 1990), and FASTA 
(Pearson, W.R., Methods in Enzymology 183:63-98, 1990, R.F. Doolittle, ed.). 

In an embodiment of the present invention, sequence identity between 
two polynucleotides is determined by application of GAP (Wisconsin Package 
Version 10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the 
alignment method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol. 
48:443-453, 1970.) GAP considers all possible alignments and gap positions between 
two sequences and creates a global alignment that maximizes the number of matched 
residues and minimizes the number and size of gaps. A scoring matrix is used to 
assign values for symbol matches. In addition, a gap creation penalty and a gap 
extension penalty are required to limit the insertion of gaps into the alignment. 
Default program parameters for polynucleotide comparisons using GAP are the 
nwsgapdna.cmp scoring matrix (MATrix=nwsgapdna.cmp), a gap creation parameter 
(GAPweight=50) and a gap extension parameter (LENgthweight=3). 

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B nucleotide 
sequences, in addition to being substantially similar across their entire length, produce 
individual NS3, NS4A, NS4B, NS5A and NS5B regions that are substantially similar 
to the corresponding regions present in SEQ. ID. NO. 2. The corresponding coding 
regions in SEQ. ID. NO. 2 are provided as follows: Met-NS3, nucleotides 7-1902; 
NS4A nucleotides 1903-2064; NS4B nucleotides 2065-2847; NS5A nucleotides 
2848-4188; NS5B nucleotides 4189-5661. 

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B 
encoding region has a nucleotide sequence identity to the corresponding region in 
SEQ. ID. NO. 2 of at least 65%, at least 75%, at least 85%, at least 95%, at least 99% 
or 100%; or a nucleotide difference to SEQ. ID. NO. 2 of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 
1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 1-25, 1-30, 
1-35, 1-40, 1-45, or 1-50 nucleotides. 

C. Gene Expression Cassettes 

A gene expression cassette contains elements needed for polypeptide 
expression. Reference to "polypeptide" does not provide a size limitation and 
includes protein. Regulatory elements present in a gene expression cassette generally 
include: (a) a promoter transcriptionally coupled to a nucleotide sequence encoding 
the polypeptide, (b) a 5' ribosome binding site functionally coupled to the nucleotide 
sequence, (c) a terminator joined to the 3' end of the nucleotide sequence, and (d) a 3' 
polyadenylation signal functionally coupled to the nucleotide sequence. Additional 
regulatory elements useful for enhancing or regulating gene expression or polypeptide 
processing may also be present. 

Promoters are genetic elements that are recognized by an RNA 
polymerase and mediate transcription of downstream regions. Preferred promoters 
are strong promoters that provide for increased levels of transcription. Examples of 
strong promoters are the immediate early human cytomegalovirus promoter (CMV), 
and CMV with intron A. (Chapman et al., Nucl. Acids Res. 19:3979-3986, 1991.) 
Additional examples of promoters include naturally occurring promoters such as the 
EF1 alpha promoter, the murine CMV promoter, Rous sarcoma virus promoter, and 
SV40 early/late promoters and the β-actin promoter; and artificial promoters such as a 
synthetic muscle specific promoter and a chimeric muscle-specific/CMV promoter (Li 
et al., Nat. Biotechnol. 17:241-245, 1999, Hagstrom et al., Blood 95:2536-2542, 
2000). 

The ribosome binding site is located at or near the initiation codon. 
Examples of preferred ribosome binding sites include CCACCAUGG, 
CCGCCAUGG, and ACCAUGG, where AUG is the initiation codon. (Kozak, Cell 
44:283-292, 1986). Another example of a ribosome binding site is GCCACCAUGG 
(SEQ. ID. NO. 12). 
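
As an illustration, a transcript can be scanned for AUG codons that sit in one of the ribosome binding site contexts listed above. The sketch below is a minimal example under stated assumptions: the input is a plain string, T is treated as U, and when several listed sites match the same AUG only the longest is reported. The function name and test transcripts are hypothetical.

```python
import re

# Sites named in the text, longest first so the most specific match is reported.
RBS_SITES = ("GCCACCAUGG", "CCACCAUGG", "CCGCCAUGG", "ACCAUGG")


def find_ribosome_binding_sites(mrna: str):
    """Return (index_of_AUG, matching_site) for each AUG found in a listed context."""
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for match in re.finditer("AUG", mrna):
        aug = match.start()
        for site in RBS_SITES:
            start = aug - site.index("AUG")  # where the site would have to begin
            if start >= 0 and mrna[start:start + len(site)] == site:
                hits.append((aug, site))
                break
    return hits


if __name__ == "__main__":
    # Hypothetical transcript fragment carrying GCCACCAUGG.
    print(find_ribosome_binding_sites("GGGCCACCAUGGCU"))
```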

The polyadenylation signal is responsible for cleaving the transcribed 
RNA and the addition of a poly(A) tail to the RNA. The polyadenylation signal in 
higher eukaryotes contains an AAUAAA sequence about 11-30 nucleotides from the 
polyadenylation addition site. The AAUAAA sequence is involved in signaling RNA 
cleavage. (Lewin, Genes IV, Oxford University Press, NY, 1990.) The poly(A) tail 
is important for the mRNA processing. 

Polyadenylation signals that can be used as part of a gene expression 
cassette include the minimal rabbit β-globin polyadenylation signal and the bovine 
growth hormone polyadenylation (BGH). (Xu et al., Gene 272:149-156, 2001, Post et 






WO 03/031588 



PCT/US02/32512 



al., U.S. Patent No. 5,122,458.) Additional examples include the Synthetic 
Polyadenylation Signal (SPA) and SV40 polyadenylation signal. The SPA sequence 
is as follows: AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUGUUGGUUUUUUGUGUG 
(SEQ. ID. NO. 13). 
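
A short check confirms that the SPA sequence given above carries the AAUAAA hexamer. The sketch below reports every position at which the hexamer starts, including overlapping occurrences; the function name is hypothetical, and positions are 0-based string indices.

```python
# Synthetic Polyadenylation Signal as written in the text (SEQ. ID. NO. 13).
SPA = "AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUGUUGGUUUUUUGUGUG"

HEXAMER = "AAUAAA"


def find_polya_hexamers(rna: str):
    """Return 0-based start positions of every AAUAAA, overlaps included."""
    rna = rna.upper().replace("T", "U")
    positions = []
    start = rna.find(HEXAMER)
    while start != -1:
        positions.append(start)
        start = rna.find(HEXAMER, start + 1)
    return positions


if __name__ == "__main__":
    print(len(SPA), find_polya_hexamers(SPA))
```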

Examples of additional regulatory elements useful for enhancing or 
regulating gene expression or polypeptide processing that may be present include an 
enhancer, a leader sequence and an operator. An enhancer region increases 
transcription. Examples of enhancer regions include the CMV enhancer and the 
SV40 enhancer. (Hitt et al., Methods in Molecular Genetics 7:13-30, 1995, Xu, et al., 
Gene 272:149-156, 2001.) An enhancer region can be associated with a promoter. 

A leader sequence is an amino acid region on a polypeptide that directs 
the polypeptide into the proteasome. Nucleic acid encoding the leader sequence is 5' 
of a structural gene and is transcribed along with the structural gene. An example of a 
leader sequence is tPA. 

An operator sequence can be used to regulate gene expression. For 
example, the Tet operator sequence can be used to repress gene expression. 

II. THERAPEUTIC VECTORS 

Nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide can be introduced into a patient using vectors suitable for therapeutic 
administration. Suitable vectors can deliver nucleic acid into a target cell without 
causing an unacceptable side effect. 

Cellular expression is achieved using a gene expression cassette 
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. The gene expression 
cassette contains regulatory elements for producing and processing a sufficient 
amount of nucleic acid inside a target cell to achieve a beneficial effect. 

Examples of vectors that can be used for therapeutic applications 
include first and second generation adenovectors, helper dependent adenovectors, 
adeno-associated viral vectors, retroviral vectors, alpha virus vectors, Venezuelan 
Equine Encephalitis virus vector, and plasmid vectors. (Hitt, et al., Advances in 
Pharmacology 40:137-206, 1997, Johnston et al., U.S. Patent No. 6,156,588, and 
Johnston et al., International Publication Number WO 95/32733.) Preferred vectors 
for introducing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide into a subject are 
first generation adenoviral vectors and plasmid DNA vectors. 

A. First Generation Adenovectors 

First generation adenovectors for expressing a gene expression cassette 
contain the expression cassette in an E1 and optionally E3 deleted recombinant 
adenovirus genome. The deletion in the E1 region is sufficiently large to remove 
elements needed for adenoviral replication. 

First generation adenovectors for expressing a Met-NS3-NS4A-NS4B- 
NS5A-NS5B polypeptide contain an E1 and E3 deleted recombinant adenovirus 
genome. The deletion in the E1 region is sufficiently large to remove elements 
needed for adenoviral replication. The combinations of deletions of the E1 and E3 
regions are sufficiently large to accommodate a gene expression cassette encoding a 
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. 

The adenovirus has a double-stranded linear genome with inverted 
terminal repeats at both ends. During viral replication, the genome is packaged inside 
a viral capsid to form a virion. The virus enters its target cell through viral attachment 
followed by internalization. (Hitt et al., Advances in Pharmacology 40:137-206, 
1997.) 

Adenovectors can be based on different adenovirus serotypes such as 
those found in humans or animals. Examples of animal adenoviruses include bovine, 
porcine, chimp, murine, canine, and avian (CELO). Preferred adenovectors are based 
on human serotypes, more preferably Group B, C, or D serotypes. Examples of 
human adenovirus Group B, C, D, or E serotypes include types 2 ("Ad2"), 4 ("Ad4"), 
5 ("Ad5"), 6 ("Ad6"), 24 ("Ad24"), 26 ("Ad26"), 34 ("Ad34") and 35 ("Ad35"). 
Adenovectors can contain regions from a single adenovirus or from two or more 
adenoviruses. 

In different embodiments adenovectors are based on Ad5, Ad6, or a 
combination thereof. Ad5 is described by Chroboczek, et al., J. Virology 186:280- 
285, 1992. Ad6 is described in Figures 7A-7N. An Ad6 based vector containing 
Ad5 regions is described in the Example section provided below. 

Adenovectors do not need to have their E1 and E3 regions completely 
removed. Rather, a sufficient amount of the E1 region is removed to render the vector 
replication incompetent in the absence of the E1 proteins being supplied in trans; and 
the E1 deletion or the combination of the E1 and E3 deletions is sufficiently 
large to accommodate a gene expression cassette. 

E1 deletions can be obtained starting at about base pair 342 going up to 
about base pair 3523 of Ad5, or a corresponding region from other adenoviruses. 
Preferably, the deleted region involves removing a region from about base pair 450 to 
about base pair 3511 of Ad5, or a corresponding region from other adenoviruses. 
Larger E1 region deletions starting at about base pair 341 remove elements that 
facilitate virus packaging. 

E3 deletions can be obtained starting at about base pair 27865 to about 
base pair 30995 of Ad5, or the corresponding region of other adenovectors. 
Preferably the deletion region involves removing a region from about base pair 28134 
up to about base pair 30817 of Ad5, or the corresponding region of other 
adenovectors. 

The combination of deletions to the E1 region and optionally the E3 
region should be sufficiently large so that the overall size of the recombinant genome 
containing the gene expression cassette does not exceed about 105% of the wild type 
adenovirus genome. For example, as recombinant adenovirus Ad5 genomes increase 
in size above about 105% the genome becomes unstable. (Bett et al., Journal of 
Virology 67:5911-5921, 1993.) 

Preferably, the size of the recombinant adenovirus genome containing 
the gene expression cassette is about 85% to about 105% the size of the wild type 
adenovirus genome. In different embodiments, the size of the recombinant 
adenovirus genome containing the expression cassette is about 100% to about 
105.2%, or about 100%, the size of the wild type genome. 

Approximately 7,500 base pairs can be inserted into an adenovirus genome 
with an E1 and E3 deletion. Without any deletion, the Ad5 genome is 35,935 base 
pairs and the Ad6 genome is 35,759 base pairs. 
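
The size constraint can be checked with simple arithmetic. The sketch below assumes, for illustration, that the E1 deletion removes Ad5 base pairs 451-3510 and the E3 deletion removes base pairs 28134-30817 (the exact boundaries are described in the text only as "about"), and treats the upper limit as exactly 105% of the wild type length. All names are hypothetical.

```python
AD5_WILD_TYPE_BP = 35_935

# Assumed inclusive deletion boundaries in Ad5 coordinates (illustrative).
E1_DELETION = (451, 3510)
E3_DELETION = (28134, 30817)


def span_length(span):
    start, end = span
    return end - start + 1


def recombinant_size(wild_type_bp, deletions, insert_bp):
    """Genome length after removing the deleted spans and adding the cassette."""
    return wild_type_bp - sum(span_length(d) for d in deletions) + insert_bp


def max_insert(wild_type_bp, deletions, limit_percent=105):
    """Largest insert (bp) that keeps the genome at or below the size limit."""
    ceiling = wild_type_bp * limit_percent // 100  # integer arithmetic, rounds down
    backbone = wild_type_bp - sum(span_length(d) for d in deletions)
    return ceiling - backbone


if __name__ == "__main__":
    deletions = [E1_DELETION, E3_DELETION]
    print(max_insert(AD5_WILD_TYPE_BP, deletions))
    print(recombinant_size(AD5_WILD_TYPE_BP, deletions, 6000))
```

Under these assumptions the permitted insert comes to roughly 7,500 base pairs, in line with the figure stated above.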

Replication of first generation adenovectors can be performed by 
supplying the E1 gene products in trans. The E1 gene product can be supplied in 
trans, for example, by using cell lines that have been transformed with the adenovirus 
E1 region. Examples of cells and cell lines transformed with the adenovirus E1 
region are HEK 293 cells, 911 cells, PER.C6™ cells, and transfected primary human 
amniocytes. (Graham et al., Journal of General Virology 36:59-72, 1977, Schiedner et 
al., Human Gene Therapy 11:2105-2116, 2000, Fallaux et al., Human Gene Therapy 
9:1909-1917, 1998, Bout et al., U.S. Patent No. 6,033,908.) 

A Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette should be 
inserted into a recombinant adenovirus genome in the region corresponding to the 
deleted E1 region or the deleted E3 region. The expression cassette can have a 
parallel or anti-parallel orientation. In a parallel orientation the transcription direction 
of the inserted gene is the same direction as the deleted E1 or E3 gene. In an anti- 
parallel orientation the opposite strand serves as a template and the 
transcription direction is in the opposite direction. 
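
In sequence terms, the two orientations differ by a reverse complement. The sketch below assumes the cassette is written 5' to 3' along its coding strand and that the genome is represented by a single reference strand; placing the cassette in anti-parallel orientation then corresponds to inserting its reverse complement. The function names and the orientation labels used as arguments are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def orient_cassette(cassette: str, orientation: str) -> str:
    """Sequence to place on the genome reference strand for a given orientation."""
    if orientation == "parallel":
        return cassette
    if orientation == "anti-parallel":
        return reverse_complement(cassette)
    raise ValueError("orientation must be 'parallel' or 'anti-parallel'")


if __name__ == "__main__":
    # Hypothetical four-base cassette fragment.
    print(orient_cassette("ATGC", "parallel"))
    print(orient_cassette("ATGC", "anti-parallel"))
```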

In an embodiment of the present invention the adenovector has a gene 
expression cassette inserted in the E1 deleted region. The vector contains: 

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) a gene expression cassette in an E1 parallel or E1 anti-parallel 
orientation joined to the first region; 

c) a second adenovirus region from about base pair 3511 to about 
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to the expression cassette; 

d) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

e) a fourth adenovirus region from about base pair 30818 to about 
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base 
pair 33784 corresponding to Ad6, joined to the third region; and 

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region. 

In another embodiment of the present invention the adenovector has an 
expression cassette inserted in the E3 deleted region. The vector contains: 

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) a second adenovirus region from about base pair 3511 to about 
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to the first region; 

c) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

d) a gene expression cassette in an E3 parallel or E3 anti-parallel 
orientation joined to the third region; 

e) a fourth adenovirus region from about base pair 30818 to about 
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base 
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and 

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region. 

In preferred different embodiments concerning adenovirus regions that 
are present: (1) the first, second, third, fourth, and fifth region corresponds to Ad5; (2) 
the first, second, third, fourth, and fifth region corresponds to Ad6; and (3) the first 
region corresponds to Ad5, the second region corresponds to Ad5, the third region 
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region 
corresponds to Ad5. 
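
The three region combinations can be compared by summing the lengths of the five retained adenovirus regions. The sketch below tabulates the base pair ranges given above for Ad5 and Ad6 and treats each "about" boundary as exact, which is an assumption made for illustration; the table and function names are hypothetical.

```python
# Inclusive base pair ranges of the five retained regions, per serotype.
REGIONS = {
    "Ad5": [(1, 450), (3511, 5548), (5549, 28133), (30818, 33966), (33967, 35935)],
    "Ad6": [(1, 450), (3508, 5541), (5542, 28156), (30789, 33784), (33785, 35759)],
}


def backbone_length(sources):
    """Total adenovirus base pairs when region i is taken from sources[i]."""
    if len(sources) != 5:
        raise ValueError("exactly five region sources are required")
    total = 0
    for index, serotype in enumerate(sources):
        start, end = REGIONS[serotype][index]
        total += end - start + 1
    return total


if __name__ == "__main__":
    print(backbone_length(["Ad5"] * 5))                          # embodiment (1)
    print(backbone_length(["Ad6"] * 5))                          # embodiment (2)
    print(backbone_length(["Ad5", "Ad5", "Ad6", "Ad6", "Ad5"]))  # embodiment (3)
```

The length of the expression cassette is added to these totals to obtain the size of the complete recombinant genome.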

B. DNA Plasmid Vectors 

DNA vaccine plasmid vectors contain a gene expression cassette along 
with elements facilitating replication and preferably vector selection. Preferred 
elements provide for replication in non-mammalian cells and a selectable marker. 
The vectors should not contain elements providing for replication in human cells or 
for integration into human nucleic acid. 

The selectable marker facilitates selection of nucleic acids containing 
the marker. Preferred selectable markers are those that confer antibiotic resistance. 
Examples of antibiotic selection genes include nucleic acid encoding resistance to 
ampicillin, neomycin, and kanamycin. 

Suitable DNA vaccine vectors can be produced starting with a plasmid 
containing a bacterial origin of replication and a selectable marker. Examples of 
bacterial origins of replication providing for higher yields include the ColE1 plasmid- 
derived bacterial origin of replication. (Donnelly et al., Annu. Rev. Immunol. 15:617- 
648, 1997.) 

The presence of the bacterial origin of replication and selectable 
marker allows for the production of the DNA vector in a bacterial strain such as E. 
coli. The selectable marker is used to eliminate bacteria not containing the DNA 
vector. 

III. AD6 RECOMBINANT NUCLEIC ACID 

Ad6 recombinant nucleic acid comprises an Ad6 region substantially 
similar to an Ad6 region found in SEQ. ID. NO. 8, and a region not present in Ad6 
nucleic acid. Recombinant nucleic acid comprising Ad6 regions has different uses 
such as in producing different Ad6 regions, as intermediates in the production of Ad6 
based vectors, and as a vector for delivering a recombinant gene. 

As depicted in Figure 9, the genomic organization of Ad6 is very 
similar to the genomic organization of Ad5. The homology between Ad5 and Ad6 is 
approximately 98%. 

In different embodiments, the Ad6 recombinant nucleic acid comprises 
a nucleotide region substantially similar to E1A, E1B, E2B, E2A, E3, E4, L1, L2, L3, 
or L4, or any combination thereof. A substantially similar nucleic acid region to an 
Ad6 region has a nucleotide sequence identity of at least 65%, at least 75%, at least 
85%, at least 95%, at least 99% or 100%; or a nucleotide difference of 1-2, 1-3, 1-4, 
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides. Techniques and embodiments for 
determining substantially similar nucleic acid sequences are described in Section I.B. 
supra. 

Preferably, the recombinant Ad6 nucleic acid contains an expression 
cassette coding for a polypeptide not found in Ad6. Examples of expression cassettes 
include those coding for HCV regions and those coding for other types of 
polypeptides. 

Different types of adenoviral vectors can be produced incorporating 
different amounts of Ad6, such as first and second generation adenovectors. As noted 
in Section II.A. supra, first generation adenovectors are defective in E1 and can 
replicate when E1 is supplied in trans. 

Second generation adenovectors contain less adenoviral genome than 
first generation vectors and can be used in conjunction with complementing cell lines 
and/or helper vectors supplying adenoviral proteins. Second generation adenovectors 
are described in different references such as Russell, Journal of General Virology 
81:2573-2604, 2000; Hitt et al., 1997, Human Ad vectors for Gene Transfer, 
Advances in Pharmacology, Vol. 40, Academic Press. 

In an embodiment of the present invention, the Ad6 recombinant 
nucleic acid is an adenovirus vector defective in E1 that is able to replicate when E1 is 
supplied in trans. Expression cassettes can be inserted into a deleted E1 region and/or 
a deleted E3 region. 

An example of an Ad6 based adenoviral vector with an expression 
cassette provided in a deleted E1 region comprises or consists of: 

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) a gene expression cassette in an E1 parallel or E1 anti-parallel 
orientation joined to the first region; 

c) a second adenovirus region from about base pair 3511 to about 
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to the expression cassette; 

d) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

e) an optionally present fourth region from about base pair 28134 
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to 
about base pair 30788 corresponding to Ad6, joined to the third region; 

f) a fifth adenovirus region from about base pair 30818 to about 
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base 
pair 33784 corresponding to Ad6, wherein the fifth region is joined to the fourth 
region if the fourth region is present, or the fifth is joined to the third region if the 
fourth region is not present; and 

g) a sixth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fifth region; 

wherein at least one Ad6 region is present. 

In different embodiments of the invention, all of the regions are from 
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4 
regions selected from the second, third, fourth, and fifth regions are from Ad6. 

An example of an Ad6 based adenoviral vector with an expression 
cassette provided in a deleted E3 region comprises or consists of: 

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) a second adenovirus region from about base pair 3511 to about 
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to the first region; 

c) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

d) a gene expression cassette in an E3 parallel or E3 anti-parallel 
orientation joined to the third region; 

e) a fourth adenovirus region from about base pair 30818 to about 
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base 
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and 

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region; 

wherein at least one Ad6 region is present. 

In different embodiments of the invention, all of the regions are from 
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4 
regions selected from the second, third, fourth and fifth regions are from Ad6. 

IV. VECTOR PRODUCTION 

Vectors can be produced using recombinant nucleic acid techniques 
such as those involving the use of restriction enzymes, nucleic acid ligation, and 
homologous recombination. Recombinant nucleic acid techniques are well known in 
the art. (Ausubel, Current Protocols in Molecular Biology, John Wiley, 1987-1998, 
and Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold 
Spring Harbor Laboratory Press, 1989.) 

Intermediate vectors are used to derive a therapeutic vector or to 
transfer an expression cassette or portion thereof from one vector to another vector. 
Examples of intermediate vectors include adenovirus genome plasmids and shuttle 
vectors. 

Useful elements in an intermediate vector include an origin of 
replication, a selectable marker, homologous recombination regions, and convenient 
restriction sites. Convenient restriction sites can be used to facilitate cloning or release 
of a nucleic acid sequence. 

Homologous recombination regions provide nucleic acid sequence 
regions that are homologous to a target region in another nucleic acid molecule. The 
homologous regions flank the nucleic acid sequence that is being inserted into the 
target region. In different embodiments homologous regions are preferably about 150 
to 600 nucleotides in length, or about 100 to 500 nucleotides in length. 

An embodiment of the present invention describes a shuttle vector
containing a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette, a selectable
marker, a bacterial origin of replication, a first adenovirus homology region and a
second adenovirus homology region that target the expression cassette to insert in
or replace an E1 region. The first and second homology regions flank the expression
cassette. The first homology region contains at least about 100 base pairs
substantially homologous to at least the right end (3' end) of a wild-type adenovirus
region from about base pairs 4-450. The second homology region contains at least
about 100 base pairs substantially homologous to at least the left end (5' end) of Ad5
from about base pairs 3511-5792, or the corresponding region from another adenovirus.

Reference to "substantially homologous" indicates a sufficient degree
of homology to specifically recombine with a target region. In different embodiments
substantially homologous refers to at least 85%, at least 95%, or 100% sequence
identity. Sequence identity can be calculated as described in Section I.B. supra.

One method of producing adenovectors is through the creation of an
adenovirus genome plasmid containing an expression cassette. The pre-Adenovirus
plasmid contains all the adenovirus sequences needed for replication in the desired
complementing cell line. The pre-Adenovirus plasmid is then digested with a
restriction enzyme to release the viral ITRs and transfected into the complementing
cell line for virus rescue. The ITRs must be released from plasmid sequences to
allow replication to occur. Adenovector rescue results in the production of an
adenovector containing the expression cassette.

A. Adenovirus Genome Plasmids

Adenovirus genome plasmids contain an adenovector sequence inside
a longer-length plasmid (which may be a cosmid). The longer-length plasmid may
contain additional elements such as those facilitating growth and selection in
eukaryotic or bacterial cells depending upon the procedures employed to produce and
maintain the plasmid. Techniques for producing adenovirus genome plasmids include
those involving the use of shuttle vectors and homologous recombination, and those
involving the insertion of a gene expression cassette into an adenovirus cosmid. (Hitt
et al., Methods in Molecular Genetics 7:13-30, 1995, Danthinne et al., Gene Therapy
7:1707-1714, 2000.)

Adenovirus genome plasmids preferably have a gene expression
cassette inserted into an E1 or E3 deleted region. In an embodiment of the present
invention, the adenovirus genome plasmid contains a gene expression cassette
inserted in the E1 deleted region, an origin of replication, a selectable marker, and the
recombinant adenovirus region is made up of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region; and

g) an optionally present E3 region corresponding to all or part of
the E3 region present in Ad5 or Ad6, which may be present for smaller inserts taking
into account the overall size of the desired adenovector.
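
The region coordinates listed above can be checked with a short script. The following is a minimal sketch, assuming the "about" coordinates are taken as exact, 1-based, inclusive base-pair positions; the names `REGIONS`, `gaps` and `retained_length` are hypothetical and introduced only for illustration. The gaps it reports between the first and second regions and between the third and fourth regions sit where the text places the E1 and E3 deleted regions.

```python
# Region coordinates (1-based, inclusive) copied from items a) and c)-f) above.
# Treating the "about" values as exact is an assumption made for illustration.
REGIONS = {
    "Ad5": [(1, 450), (3511, 5548), (5549, 28133), (30818, 33966), (33967, 35935)],
    "Ad6": [(1, 450), (3508, 5541), (5542, 28156), (30789, 33784), (33785, 35759)],
}


def gaps(regions):
    """Return (start, end, length) for every gap between consecutive regions."""
    found = []
    for (_, prev_end), (next_start, _) in zip(regions, regions[1:]):
        if next_start > prev_end + 1:
            found.append((prev_end + 1, next_start - 1, next_start - prev_end - 1))
    return found


def retained_length(regions):
    """Total number of base pairs covered by the listed regions."""
    return sum(end - start + 1 for start, end in regions)


if __name__ == "__main__":
    for serotype, regions in REGIONS.items():
        print(serotype, "retained bp:", retained_length(regions))
        for start, end, length in gaps(regions):
            print("  gap", start, "-", end, "length", length)
```

With the coordinates as listed, the first gap spans 3060 bp in Ad5 and 3057 bp in Ad6, and the second gap spans 2684 bp in Ad5 and 2632 bp in Ad6.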

In another embodiment of the present invention the recombinant
adenovirus genome plasmid has the gene expression cassette inserted in the E3
deleted region. The vector contains an origin of replication, a selectable marker, and
the following:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) the gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region. 

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

An embodiment of the present invention describes a method of making
an adenovector involving a homologous recombination step to produce an adenovirus
genome plasmid and an adenovirus rescue step. The homologous recombination step
involves the use of a shuttle vector containing a Met-NS3-NS4A-NS4B-NS5A-NS5B
expression cassette flanked by adenovirus homology regions. The adenovirus
homology regions target the expression cassette into either the E1 or E3 deleted
region.

In an embodiment of the present invention concerning the production
of an adenovirus genome plasmid, the gene expression cassette is inserted into a
vector comprising: a first adenovirus region from about base pair 1 to about base pair
450 corresponding to either Ad5 or Ad6; a second adenovirus region from about base
pair 3511 to about base pair 5548 corresponding to Ad5 or from about base pair 3508
to about base pair 5541 corresponding to Ad6, joined to the first region; a third
adenovirus region from about base pair 5549 to about base pair 28133 corresponding
to Ad5 or from about base pair 5542 to about base pair 28156 corresponding to Ad6,
joined to the second region; a fourth adenovirus region from about base pair 30818 to
about base pair 33966 corresponding to Ad5 or from about base pair 30789 to about
base pair 33784 corresponding to Ad6, joined to the third region; and a fifth
adenovirus region from about base pair 33967 to about base pair 35935 corresponding
to Ad5 or from about base pair 33785 to about base pair 35759 corresponding to Ad6,
joined to the fourth region. The adenovirus genome plasmid should contain an origin
of replication and a selectable marker, and may contain all or part of the Ad5 or Ad6
E3 region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

B. Adenovector Rescue

An adenovector can be rescued from a recombinant adenovirus
genome plasmid using techniques known in the art or described herein. Examples of
techniques for adenovirus rescue well known in the art are provided by Hitt et al.,
Methods in Molecular Genetics 7:13-30, 1995, and Danthinne et al., Gene Therapy
7:1707-1714, 2000.

A preferred method of rescuing an adenovector described herein
involves boosting adenoviral replication. Boosting adenoviral replication can be
performed, for example, by supplying adenoviral functions such as E2 proteins
(polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6 on a
separate plasmid. Example 10 infra illustrates the boosting of adenoviral replication
to rescue an adenovector containing a codon optimized Met-NS3-NS4A-NS4B-NS5A-NS5B
expression cassette.

V. PARTIAL-OPTIMIZED HCV ENCODING SEQUENCES

Partial optimization of HCV polyprotein encoding nucleic acid
provides for fewer codons optimized for expression in a human than
complete optimization. The overall objective is to provide the benefits of increased
expression due to codon optimization, while facilitating the production of an
adenovector containing HCV polyprotein encoding nucleic acid having optimized
codons.

Complete optimization of an HCV polyprotein encoding sequence
provides the most frequently observed human codon for each amino acid. Complete
optimization can be performed using codon frequency tables well known in the art
and using programs such as the BACKTRANSLATE program (Wisconsin Package
version 10, Genetics Computer Group, GCG, Madison, Wisc.).

Partial optimization can be performed on an entire HCV polyprotein
encoding sequence that is present (e.g., NS3-NS5B), or one or more local regions that are
present. In different embodiments the GC content for the entire HCV encoded polyprotein
that is present is no greater than about 65%; and the GC content for one or more
local regions is no greater than about 70%.

Local regions are regions present in HCV encoding nucleic acid, and can
vary in size. For example, local regions can be about 60, about 70, about 80, about 90 or
about 100 nucleotides in length.
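
The overall and local GC limits described above can be checked on a candidate sequence with a short script. The following is a minimal sketch, assuming that a local region is evaluated as a fixed-length sliding window and that the 65% and 70% figures are used as hard thresholds; the function names and the default window of 60 nucleotides are illustrative choices, not part of the described method.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def max_local_gc(seq, window=60):
    """Highest GC fraction found in any window of the given length."""
    if len(seq) <= window:
        return gc_fraction(seq)
    return max(
        gc_fraction(seq[i:i + window]) for i in range(len(seq) - window + 1)
    )


def within_limits(seq, overall_max=0.65, local_max=0.70, window=60):
    """True if both the overall and the local GC limits are respected."""
    return gc_fraction(seq) <= overall_max and max_local_gc(seq, window) <= local_max
```

A sequence that fails the check would be a candidate for replacing some optimized codons with less GC-rich synonymous codons in the offending window.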

Partial optimization can be achieved by initially constructing an HCV
encoding polyprotein sequence to be partially optimized based on a naturally occurring
sequence. Alternatively, an optimized HCV encoding sequence can be used as a basis of
comparison to produce a partially optimized sequence.

VI. HCV COMBINATION TREATMENT 

The HCV Met-NS3-NS4A-NS4B-NS5A-NS5B vaccine can be used by
itself to treat a patient, can be used in conjunction with other HCV therapeutics, and
can be used with agents targeting other types of diseases. Additional therapeutics
include additional therapeutic agents to treat HCV and diseases having a high
prevalence in HCV infected persons. Agents targeting other types of disease include
vaccines directed against HIV and HBV.

Additional therapeutics for treating HCV include vaccines and non-vaccine
agents. (Zein, Expert Opin. Investig. Drugs 10:1457-1469, 2001.) Examples
of additional HCV vaccines include vaccines designed to elicit an immune response
against an HCV core antigen and the HCV E1, E2 or p7 region. Vaccine components
can be naturally occurring HCV polypeptides, HCV mimotope polypeptides or nucleic
acid encoding such polypeptides.

HCV mimotope polypeptides contain HCV epitopes, but have a
different sequence than a naturally occurring HCV antigen. An HCV mimotope can be
fused to a naturally occurring HCV antigen. References describing techniques for
producing mimotopes in general and describing different HCV mimotopes are
provided in Felici et al., U.S. Patent No. 5,994,083 and Nicosia et al., International
Application Number WO 99/60132.

VII. PHARMACEUTICAL ADMINISTRATION

HCV vaccines can be formulated and administered to a patient using
the guidance provided herein along with techniques well known in the art. Guidelines
for pharmaceutical administration in general are provided in, for example, Modern
Vaccinology, Ed. Kurstak, Plenum Med. Co. 1994; Remington's Pharmaceutical
Sciences 18th Edition, Ed. Gennaro, Mack Publishing, 1990; and Modern
Pharmaceutics 2nd Edition, Eds. Banker and Rhodes, Marcel Dekker, Inc., 1990, each
of which is hereby incorporated by reference herein.

HCV vaccines can be administered by different routes such as
intravenous, intraperitoneal, subcutaneous, intramuscular, intradermal, impression
through the skin, or nasal. A preferred route is intramuscular.

Intramuscular administration can be performed using different
techniques such as by injection with or without one or more electric pulses. Electric
mediated transfer can assist genetic immunization by stimulating both humoral and
cellular immune responses.

Vaccine injection can be performed using different techniques, such as
by employing a needle or a needleless injection system. An example of a needleless
injection system is a jet injection device. (Donnelly et al., International Publication
Number WO 99/52463.)

A. Electrically Mediated Transfer

Electrically mediated transfer or Gene Electro-Transfer (GET) can be
performed by delivering suitable electric pulses after nucleic acid injection. (See
Mathiesen, International Publication Number WO 98/43702.) Plasmid injection and
electroporation can be performed using stainless steel needles. Needles can be used in
couples, triplets or more complex patterns. In one configuration the needles are
soldered on a printed circuit board that is a mechanical support and connects the
needles to the electrical field generator by means of suitable cables.

The electrical stimulus is given in the form of electrical pulses. Pulses
can be of different forms (square, sinusoidal, triangular, exponential decay) and
different polarity (monopolar of positive or negative polarity, bipolar). Pulses can be
delivered either at constant voltage or constant current modality.

Different patterns of electric treatment can be used to introduce nucleic
acid vaccines including HCV and other nucleic acid vaccines into a patient. Possible
patterns of electric treatment include the following:

Treatment 1: 10 trains of 1000 square bipolar pulses delivered every
other second, pulse length 0.2 msec/phase, frequency 1000 Hz, constant voltage
mode, 45 Volts/phase, floating current.

Treatment 2: 2 trains of 100 square bipolar pulses delivered every other
second, pulse length 2 msec/phase, frequency 100 Hz, constant current mode, 100
mA/phase, floating voltage.

Treatment 3: 2 trains of bipolar pulses at a pulse length of about 2
msec/phase, for a total length of about 3 seconds, where the actual current going
through the tissue is fixed at about 50 mA.
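
The timing implied by Treatments 1 and 2 can be worked out from the stated pulse counts, frequencies and phase lengths. The following is a minimal sketch, assuming that each bipolar pulse consists of two phases of equal length and that a train lasts for the number of pulses divided by the frequency; the class `PulseTrain` and its method names are hypothetical. Treatment 3 is omitted because the text does not give its pulse count or frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    trains: int            # number of trains delivered
    pulses_per_train: int  # bipolar pulses in each train
    phase_ms: float        # length of one phase, in milliseconds
    frequency_hz: float    # pulse repetition frequency within a train

    def train_duration_s(self):
        """Duration of one train in seconds (pulses divided by frequency)."""
        return self.pulses_per_train / self.frequency_hz

    def duty_cycle(self):
        """Fraction of a train during which current flows.

        Assumes two equal phases per bipolar pulse.
        """
        return 2 * (self.phase_ms / 1000.0) * self.frequency_hz

    def active_time_s(self):
        """Summed duration of all trains, excluding pauses between trains."""
        return self.trains * self.train_duration_s()


# Parameters copied from Treatment 1 and Treatment 2 above.
TREATMENT_1 = PulseTrain(trains=10, pulses_per_train=1000, phase_ms=0.2, frequency_hz=1000.0)
TREATMENT_2 = PulseTrain(trains=2, pulses_per_train=100, phase_ms=2.0, frequency_hz=100.0)
```

Under these assumptions both treatments use trains lasting one second with the same duty cycle, and differ in the number of trains and in whether voltage or current is held constant.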

Electric pulses are delivered through an electric field generator. A
suitable generator can be composed of three independent hardware elements
assembled in a common chassis and driven by a portable PC which runs the driving
program. The software manages both basic and accessory functions. The elements of
the device are: (1) signal generator driven by a microprocessor, (2) power amplifier
and (3) digital oscilloscope.

The signal generator delivers signals having arbitrary frequency and
shape in a given range under software control. The same software has an interactive
editor for the waveform to be delivered. The generator features a digitally controlled
current limiting device (a safety feature to control the maximal current output). The
power amplifier can amplify the signal generated up to +/- 150 V. The oscilloscope is
digital and is able to sample both the voltage and the current being delivered by the
amplifier.

B. Pharmaceutical Carriers 

Pharmaceutically acceptable carriers facilitate storage and
administration of a vaccine to a subject. Examples of pharmaceutically acceptable
carriers are described herein. Additional pharmaceutically acceptable carriers are well
known in the art.

Pharmaceutically acceptable carriers may contain different components
such as a buffer, normal saline or phosphate buffered saline, sucrose, salts and
polysorbate. An example of a pharmaceutically acceptable carrier is as follows: 2.5-10
mM TRIS buffer, preferably about 5 mM TRIS buffer; 25-100 mM NaCl, preferably
about 75 mM NaCl; 2.5-10% sucrose, preferably about 5% sucrose; 0.01-2 mM
MgCl2; and 0.001%-0.01% polysorbate 80 (plant derived). The pH is preferably from
about 7.0-9.0, more preferably about 8.0. A specific example of a carrier contains 5
mM TRIS, 75 mM NaCl, 5% sucrose, 1 mM MgCl2, 0.005% polysorbate 80 at pH
8.0.
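
The specific carrier example above can be converted into weighed amounts for a given batch volume. The following is a minimal sketch under several assumptions that the text does not state: the percentages are read as weight/volume, TRIS is weighed as Tris base, MgCl2 is weighed as the anhydrous salt, and the molar masses are standard reference values. The function name `grams_needed` is hypothetical, and pH adjustment to 8.0 is not modeled.

```python
# Standard molar masses in g/mol (assumed forms: Tris base, anhydrous MgCl2).
MOLAR_MASS = {"TRIS": 121.14, "NaCl": 58.44, "MgCl2": 95.21}

# Concentrations copied from the specific carrier example above.
MILLIMOLAR = {"TRIS": 5.0, "NaCl": 75.0, "MgCl2": 1.0}

# Percentages from the text, interpreted here as weight/volume (an assumption).
PERCENT_WV = {"sucrose": 5.0, "polysorbate 80": 0.005}


def grams_needed(volume_l):
    """Grams of each component needed for the given volume in litres."""
    grams = {
        name: (mm / 1000.0) * MOLAR_MASS[name] * volume_l
        for name, mm in MILLIMOLAR.items()
    }
    # 1% weight/volume corresponds to 10 g per litre.
    grams.update(
        {name: pct * 10.0 * volume_l for name, pct in PERCENT_WV.items()}
    )
    return grams


if __name__ == "__main__":
    for component, mass in grams_needed(1.0).items():
        print(f"{component}: {mass:.4f} g per litre")
```

If a hydrated salt or Tris hydrochloride is used instead, the corresponding molar mass would need to be substituted.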

C. Dosing Regimes 

Suitable dosing regimens can be determined taking into account the
efficacy of a particular vaccine and factors such as age, weight, sex and medical
condition of a patient; the route of administration; the desired effect; and the number
of doses. The efficacy of a particular vaccine depends on different factors such as the
ability of a particular vaccine to produce polypeptide that is expressed and processed
in a cell and presented in the context of MHC class I and II complexes.

HCV encoding nucleic acid administered to a patient can be part of
different types of vectors including viral vectors such as adenovector, and DNA
plasmid vaccines. In different embodiments concerning administration of a DNA
plasmid, about 0.1 to 10 mg of plasmid is administered to a patient, or about 1 to 5
mg of plasmid is administered to a patient. In different embodiments concerning
administration of a viral vector, preferably an adenoviral vector, about 10^5 to 10^11
viral particles are administered to a patient, or about 10^7 to 10^10 viral particles are
administered to a patient.

Viral vector vaccines and DNA plasmid vaccines may be administered
alone, or may be part of a prime and boost administration regimen. A mixed modality
priming and booster inoculation involves either priming with a DNA vaccine and
boosting with viral vector vaccine, or priming with a viral vector vaccine and boosting
with a DNA vaccine.

Multiple priming, for example, about 2-4 or more primings, may be used. The
length of time between priming and boost may typically vary from about four months
to a year, but other time frames may be used. The use of a priming regimen with a
DNA vaccine may be preferred in situations where a person has a pre-existing
anti-adenovirus immune response.

In an embodiment of the present invention, 1x10^7 to 1x10^12 particles
and preferably about 1x10^10 to 1x10^11 particles of adenovector are administered
directly into muscle tissue. Following initial vaccination a boost is performed with an
adenovector or DNA vaccine.

In another embodiment of the present invention initial vaccination is
performed with a DNA vaccine directly into muscle tissue. Following initial
vaccination a boost is performed with an adenovector or DNA vaccine.

Agents such as interleukin-12, GM-CSF, B7-1, B7-2, IP10, Mig-1 can
be coadministered to boost the immune response. The agents can be coadministered
as proteins or through use of nucleic acid vectors.

D. Heterologous Prime-Boost 

Heterologous prime-boost is a mixed modality involving the use of one
type of viral vector for priming and another type of viral vector for boosting. The
heterologous prime-boost can involve related vectors such as vectors based on
different adenovirus serotypes and more distantly related viruses such as adenovirus
and poxvirus. The use of poxvirus and adenovirus vectors to protect mice against
malaria is illustrated by Gilbert et al., Vaccine 20:1039-1045, 2002.

Different embodiments concerning priming and boosting involve the
following types of vectors expressing desired antigens such as Met-NS3-NS4A-NS4B-NS5A-NS5B:
Ad5 vector followed by Ad6 vector; Ad6 vector followed by
Ad5 vector; Ad5 vector followed by poxvirus vector; poxvirus vector followed by
Ad5 vector; Ad6 vector followed by poxvirus vector; and poxvirus vector followed by
Ad6 vector.

The length of time between priming and boosting typically varies from
about four months to a year, but other time frames may be used. The minimum time
frame should be sufficient to allow for an immunological rest. In an embodiment, this
rest is for a period of at least 6 months. Priming may involve multiple priming with
one type of vector, such as 2-4 primings.

Expression cassettes present in a poxvirus vector should contain a
promoter either native to, or derived from, the poxvirus of interest or another poxvirus
member. Different strategies for constructing and employing different types of
poxvirus based vectors including those based on vaccinia virus, modified vaccinia
virus, avipox virus, raccoon poxvirus, modified vaccinia virus Ankara,
canarypoxviruses (such as ALVAC), fowlpoxviruses, cowpoxviruses, and NYVAC
are well known in the art. (Moss, Current Topics in Microbiology and Immunology
158:25-38, 1982; Earl et al., In Current Protocols in Molecular Biology, Ausubel et
al. eds., New York: Greene Publishing Associates & Wiley Interscience;
1991:16.16.1-16.16.7; Child et al., Virology 174(2):625-9, 1990; Tartaglia et al.,
Virology 188:217-232, 1992; U.S. Patent Nos. 4,603,112, 4,722,848, 4,769,330,
5,110,587, 5,174,993, 5,185,146, 5,266,313, 5,505,941, 5,863,542, and 5,942,235.)

E. Adjuvants 

HCV vaccines can be formulated with an adjuvant. Adjuvants are
particularly useful for DNA plasmid vaccines. Examples of adjuvants are alum,
AlPO4, alhydrogel, Lipid-A and derivatives or variants thereof, Freund's incomplete
adjuvant, neutral liposomes, liposomes containing the vaccine and cytokines,
non-ionic block copolymers, and chemokines.

Non-ionic block polymers containing polyoxyethylene (POE) and
polyoxypropylene (POP), such as POE-POP-POE block copolymers, may be used as an
adjuvant. (Newman et al., Critical Reviews in Therapeutic Drug Carrier Systems
15:89-142, 1998.) The immune response to a nucleic acid can be enhanced using a
non-ionic block copolymer combined with an anionic surfactant.

A specific example of an adjuvant formulation is one containing CRL-1005
(CytRx Research Laboratories), DNA, and benzalkonium chloride (BAK).
The formulation can be prepared by adding pure polymer to a cold (< 5°C) solution of
plasmid DNA in PBS using a positive displacement pipette. The solution is then
vortexed to solubilize the polymer. After complete solubilization of the polymer a
clear solution is obtained at temperatures below the cloud point of the polymer
(~6-7°C). Approximately 4 mM BAK is then added to the DNA/CRL-1005 solution in
PBS, by slow addition of a dilute solution of BAK dissolved in PBS. The initial DNA
concentration is approximately 6 mg/mL before the addition of polymer and BAK,
and the final DNA concentration is about 5 mg/mL. After BAK addition the
formulation is vortexed extensively, while the temperature is allowed to increase from
~2°C to above the cloud point. The formulation is then placed on ice to decrease the
temperature below the cloud point. Then, the formulation is vortexed while the
temperature is allowed to increase from ~2°C to above the cloud point. Cooling and
mixing while the temperature is allowed to increase from ~2°C to above the cloud
point is repeated several times, until the particle size of the formulation is about
200-500 nm, as measured by dynamic light scattering. The formulation is then stored on
ice until the solution is clear, then placed in storage at -70°C. Before use, the
formulation is allowed to thaw at room temperature.

F. Vaccine Storage

Adenovector and DNA vaccines can be stored using different types of
buffers. For example, buffer A105 described in Example 9 infra can be used for
vector storage.

Storage of DNA can be enhanced by removal or chelation of trace
metal ions. Reagents such as succinic or malic acid, and chelators can be used to
enhance DNA vaccine stability. Examples of chelators include multiple phosphate
ligands and EDTA. The inclusion of non-reducing free radical scavengers, such as
ethanol or glycerol, can also be useful to prevent damage of DNA plasmid from free
radical production. Furthermore, the buffer type, pH, salt concentration, light
exposure, as well as the type of sterilization process used to prepare the vials, may be
controlled in the formulation to optimize the stability of the DNA vaccine.

VIII. EXAMPLES

Examples are provided below to further illustrate different features of
the present invention. The examples also illustrate useful methodology for practicing
the invention. These examples do not limit the claimed invention.

Example 1: Met-NS3-NS4A-NS4B-NS5A-NS5B Expression Cassettes 

Different gene expression cassettes encoding HCV NS3-NS4A-NS4B-NS5A-NS5B
were constructed based on a 1b subtype HCV BK strain. The encoded
sequences had either (1) an active NS5B sequence ("NS"), (2) an inactive NS5B
sequence ("NSmut"), or (3) a codon optimized sequence with an inactive NS5B
sequence ("NSOPTmut"). The expression cassettes also contained a CMV
promoter/enhancer and the BGH polyadenylation signal.

The NS nucleotide sequence (SEQ. ID. NO. 5) differs from HCV BK
strain GenBank accession number M58335 by 30 out of 5952 nucleotides. The NS
amino acid sequence (SEQ. ID. NO. 6) differs from the corresponding 1b genotype
HCV BK strain by 7 out of 1984 amino acids. To allow for initiation of translation an
ATG codon is present at the 5' end of the NS sequence. A TGA termination sequence
is present at the 3' end of the NS sequence.

The NSmut nucleotide sequence (SEQ. ID. NO. 2, Figure 2) is similar
to the NS sequence. The differences between NSmut and NS include NSmut having
an altered NS5B catalytic site; an optimal ribosome binding site at the 5' end; and a
TAAA termination sequence at the 3' end. The alterations in NS5B comprise bases
5138 to 5146, which encode amino acids 1711 to 1713. The alterations result in a
change of amino acids GlyAspAsp into AlaAlaGly and create an inactive form of the
NS5B RNA-dependent RNA polymerase.

The NSOPTmut sequence (SEQ. ID. NO. 3, Figure 3) was designed
based on the amino acid sequence encoded by NSmut. The NSmut amino acid
sequence was back translated into a nucleotide sequence with the GCG (Wisconsin
Package version 10, Genetics Computer Group, GCG, Madison, Wisc.)
BACKTRANSLATE program. To generate a NSOPTmut nucleotide sequence where
each amino acid is coded for by the corresponding most frequently observed human
codon, the program was run choosing as parameter the generation of the most
probable nucleotide sequence and specifying the codon frequency table of highly
expressed human genes (human_high.cod) available within the GCG Package as
translation scheme.
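
The back-translation step described above, in which each amino acid is replaced by a single preferred codon, can be illustrated with a short script. The following is a minimal sketch and not the GCG BACKTRANSLATE program: the `PREFERRED_CODON` table is an illustrative set of commonly cited preferred human codons, not the contents of human_high.cod, and the preferred codon for some amino acids (for example arginine) differs between published tables. The function name `back_translate` is hypothetical.

```python
# Illustrative preferred human codon per amino acid (one-letter code).
# This table is an assumption for demonstration; it is not human_high.cod.
PREFERRED_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein):
    """Return a nucleotide sequence using one preferred codon per residue.

    Raises KeyError for characters that are not standard amino acids.
    """
    return "".join(PREFERRED_CODON[residue] for residue in protein.upper())


if __name__ == "__main__":
    # The wild-type GlyAspAsp motif and the AlaAlaGly replacement named above.
    print(back_translate("GDD"))
    print(back_translate("AAG"))
```

A sequence produced this way tends to be GC-rich, which is the motivation given earlier for the partial optimization approach.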

Example 2: Generation of pV1Jns Plasmid with NS, NSmut or NSOPTmut Sequences

pV1Jns plasmids containing either the NS sequence, NSmut sequence
or NSOPTmut sequence were generated and characterized as follows:

pV1Jns Plasmid with the NS Sequence

The coding region Met-NS3-NS4A-NS4B-NS5A and the coding
region Met-NS3-NS4A-NS4B-NS5A-NS5B from a HCV BK type strain (Tomei et
al., J. Virol. 67:4017-4026, 1993) were cloned into pcDNA3 plasmid (Invitrogen),
generating pcD3-5A and pcD3-5B vectors, respectively. pcD3-5A was digested with
Hind III, blunt-ended with Klenow fill-in and subsequently digested with Xba I, to
generate a fragment corresponding to the coding region of Met-NS3-NS4A-NS4B-NS5A.
The fragment was cloned into pV1Jns-poly, digested with Bgl II, blunt-ended
with Klenow fill-in and subsequently digested with Xba I, generating pV1JnsNS3-5A.

pV1Jns-poly is a derivative of pV1JnsA plasmid (Montgomery et al.,
DNA and Cell Biol. 12:777-783, 1993), modified by insertion of a polylinker
containing recognition sites for XbaI, PmeI, PacI into the unique BglII and NotI
restriction sites. The pV1Jns plasmid with the NS sequence (pV1JnsNS3-5B) was
obtained by homologous recombination into the bacterial strain BJ5183, co-transforming
pV1JNS3-5A linearized with XbaI and NotI digestion and a PCR
fragment containing approximately 200 bp of NS5A, NS5B coding sequence and
approximately 60 bp of the BGH polyadenylation signal. The resulting plasmid
represents pV1Jns-NS.

pV1Jns-NS can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional AGCTT
then the Met-NS3-NS5B sequence (SEQ. ID. NO. 5)
then the wt TGA stop
an additional TCTAGAGCGTTTAAACCCTTAATTAAGG (SEQ. ID. NO. 14)
Bases 1912 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSmut Sequence

The V1JnsNS3-5A plasmid was modified at the 5' end of the NS3 coding
sequence by addition of a full Kozak sequence. The plasmid (V1JNS3-5Akozak) was
obtained by homologous recombination into the bacterial strain BJ5183, co-transforming
V1JNS3-5A linearized by AflII digestion and a PCR fragment containing
the proximal part of Intron A, the restriction site BglII, a full Kozak translation
initiation sequence and part of the NS3 coding sequence.

The resulting plasmid (V1JNS3-5Akozak) was linearized with Xba I
digestion and co-transformed into the bacterial strain BJ5183 with a PCR fragment
containing approximately 200 bp of NS5A, the NS5B mutated sequence, the strong
translation termination TAAA and approximately 60 bp of the BGH polyadenylation
signal. The PCR fragment was obtained by assembling two 22 bp-overlapping
fragments where mutations were introduced by the oligonucleotides used for their
amplification. The resulting plasmid represents pV1Jns-NSmut.

pV1Jns-NSmut can be summarized as follows:
Bases 1 to 1882 of pV1JnsA
then the kozak Met-NS3-NS5B(mut) TAAA sequence (SEQ. ID. NO. 2)
an additional TCTAGA
Bases 1925 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSOPTmut Sequence

The human codon-optimized synthetic gene (NSOPTmut) with
mutated NS5B to abrogate enzymatic activity, full Kozak translation initiation
sequence and a strong translation termination was digested at the BamHI and SalI
restriction sites present at the 5' and 3' ends of the gene. The gene was then cloned
into the BglII and SalI restriction sites present in the polylinker of pV1JnsA plasmid,
generating pV1Jns-NSOPTmut.

pV1Jns-NSOPTmut can be summarized as follows:

Bases 1 to 1881 of pV1JnsA

an additional C

then kozak Met-NS3-NS5B(optmut) TAAA sequence (SEQ. ID. NO. 3)

an additional TTTAAATGTTTAAAC (SEQ. ID. NO. 15)

Bases 1905 to 4909 of pV1JnsA

Plasmid Characterization

Expression of HCV NS proteins was tested by transfection of HEK 293 cells
grown in 10% FCS/DMEM supplemented with L-glutamine (final 4 mM). Twenty-four
hours before transfection, cells were plated in 6-well plates (35 mm diameter)
to reach 90-95% confluence on the day of transfection. Forty nanograms of
plasmid DNA (previously assessed as a non-saturating DNA amount) were
co-transfected with 100 ng of pRSV-Luc plasmid, containing the luciferase
reporter gene under the control of the Rous sarcoma virus promoter, using the
LIPOFECTAMINE 2000 reagent. Cells were kept in a CO2 incubator for 48 hours at
37 °C.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts were
normalized for luciferase activity and run in serial dilution on a 10%
SDS-acrylamide gel. Proteins were transferred onto nitrocellulose and assayed
with antibodies directed against NS3, NS5A and NS5B to assess strength of
expression and correct proteolytic cleavage. Mock-transfected cells were used as
a negative control. Results from representative experiments testing pV1JnsNS,
pV1JnsNSmut and pV1JnsNSOPTmut are shown in Figure 12.

Example 3: Mice Immunization with Plasmid DNA Vectors

The DNA plasmids pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut were injected
into different mouse strains to evaluate their potential to elicit anti-HCV
immune responses. Two different strains (Balb/C and C57Black6, N=9-10) were
injected intramuscularly with 25 or 50 µg of DNA followed by electrical pulses.
Each animal received two doses at a three-week interval.

The humoral immune response elicited in C57Black6 mice against the NS3
protein was measured in post-dose-two sera by ELISA on bacterially expressed NS3
protease domain. Antibodies specific for the tested antigen were detected in
animals immunized with all three vectors, with geometric mean titers (GMT)
ranging from 94000 to 133000 (Tables 1-3).

Table 1: pV1Jns-NS

Mice n.    Titer
1          105466
2          891980
3          78799
4          39496
5          543542
6          182139
7          32351
8          9S028
9          67800
GMT        94553


Table 2: pV1Jns-NSmut

Mice n.    Titer
11         202981
12         55670
13         130786
14         49748
15         17672
16         174958
17         44304
18         37337
19         78182
20         193695
GMT        75083


Table 3: pV1Jns-NSOPTmut

Mice n.    Titer
21         310349
22         43645
23         63496
24         82174
25         630778
26         297259
27         66861
28         146735
29         173506
30         77732
GMT        133165
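
The geometric mean titer (GMT) row of Tables 1-3 can be reproduced from the
individual titers. The following Python sketch is illustrative only and is not
part of the original disclosure; the function name geometric_mean_titer is an
assumption introduced here. It uses the ten titers listed in Table 2 (mice 11
to 20) and should return a value close to the reported GMT of 75083.

```python
import math


def geometric_mean_titer(titers):
    """Return the geometric mean of a list of positive ELISA titers."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    # Average the titers in log space, then transform back.
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual titers for mice 11-20, as listed in Table 2.
table2_titers = [202981, 55670, 130786, 49748, 17672,
                 174958, 44304, 37337, 78182, 193695]

gmt = geometric_mean_titer(table2_titers)
print(round(gmt))
```

Averaging in log space keeps a single very high titer from dominating the
summary, which is why a geometric rather than arithmetic mean is customary for
titer data.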

A T cell response was measured in C57Black6 mice immunized with two
intramuscular injections of 25 µg of plasmid DNA at a three-week interval. A
quantitative ELIspot assay was performed to determine the number of
IFNγ-secreting T cells in response to five pools of 20mer peptides overlapping
by ten residues and encompassing the NS3-NS5B sequence. The specific CD8+
response was analyzed by the same assay using a 20mer peptide encompassing a
CD8+ epitope for C57Black6 mice (pep1480).

Cells secreting IFNγ in an antigen-specific manner were detected using a
standard ELIspot assay. The T cell response in C57Black6 mice immunized with two
intramuscular injections of 50 µg of plasmid DNA at a three-week interval was
analyzed by the same ELIspot assay, measuring the number of IFNγ-secreting T
cells in response to five pools of 20mer peptides overlapping by ten residues
and encompassing the NS3-NS5B sequence.

Spleen cells were prepared from immunized mice and re-suspended in R10
medium (RPMI 1640 supplemented with 10% FCS, 2 mM L-Glutamine, 50 U/ml-50 µg/ml
Penicillin/Streptomycin, 10 mM Hepes, 50 µM 2-mercapto-ethanol). Multiscreen
96-well Filtration Plates (Millipore, Cat. No. MAIPS4510, Millipore Corporation,
80 Ashby Road Bedford, MA) were coated with purified rat anti-mouse IFNγ
antibody (PharMingen, Cat. No. 18181D, PharMingen, 10975 Torreyana Road, San
Diego, California 92121-1111 USA). After overnight incubation, plates were
washed with PBS 1X/0.005% Tween and blocked with 250 µl/well of R10 medium.

Splenocytes from immunized mice were prepared and incubated for twenty-four
hours in the presence or absence of 10 µM peptide at a density of 2.5 X
10^5/well or 5 X 10^5/well. After extensive washing (PBS 1X/0.005% Tween),
biotinylated rat anti-mouse IFNγ antibody (PharMingen, Cat. No. 18112D,
PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA) was
added and incubated overnight at 4 °C. For development, streptavidin-AKP
(PharMingen, Cat. No. 13043E, PharMingen, 10975 Torreyana Road, San Diego,
California 92121-1111 USA) and 1-Step™ NBT-BCIP development solution (Pierce,
Cat. No. 34042, Pierce, P.O. Box 117, Rockford, IL 61105 USA) were added.

Pools of 20mer overlapping peptides encompassing the entire sequence of the
HCV BK strain NS3 to NS5B were used to reveal HCV-specific IFNγ-secreting T
cells. Similarly, a single 20mer peptide encompassing a CD8+ epitope for
C57Black6 mice was used to detect the CD8 response. Representative data from
groups of C57Black6 and Balb/C mice (N=9-10) immunized with two injections of 25
or 50 µg of plasmid vectors pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut are
shown in Figures 13A and 13B.

Example 4: Immunization of Rhesus Macaques

Rhesus macaques (N=3) were immunized by intramuscular injection with 5 mg of
plasmid pV1Jns-NSOPTmut in 7.5 mg/ml CRL1005, 0.6 mM benzalkonium chloride.
Each animal received two doses in the deltoid muscle, at 0 and 4 weeks.

CMI was measured at different time points by IFN-γ ELISPOT. This assay
measures HCV antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be
used for a variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5A, NS5B) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins,
while smaller pools and individual peptides may be used to define the epitope
specificity of a response.
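
The tiling of a protein into overlapping 20-mers offset by 10 residues can be
sketched as follows. This Python fragment is an illustration, not the procedure
actually used to design the peptide pools: the function name, the placeholder
sequence, and in particular the handling of the C-terminal remainder (one extra
peptide covering the last 20 residues) are assumptions introduced here.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Split a protein sequence into overlapping peptides."""
    if len(protein) <= length:
        return [protein]
    starts = list(range(0, len(protein) - length + 1, offset))
    peptides = [protein[s:s + length] for s in starts]
    # Assumption: if the regular tiling stops short of the C-terminus, add one
    # final peptide covering the last `length` residues.
    if starts[-1] + length < len(protein):
        peptides.append(protein[-length:])
    return peptides


# Arbitrary 45-residue placeholder sequence (not an HCV sequence).
demo = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
peptides = overlapping_peptides(demo)
for p in peptides:
    print(p)
```

With a 10-residue offset, every internal residue is covered by two peptides,
so an epitope of up to 11 residues always lies entirely within at least one
peptide.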

IFNγ ELISPOT

The IFNγ-ELISPOT assay provides a quantitative determination of
HCV-specific T lymphocyte responses. PBMC are serially diluted and placed in
microplate wells coated with anti-rhesus IFN-γ antibody (MD-1 U-Cytech). They
are cultured with an HCV peptide pool for 20 hours, resulting in the
restimulation of the precursor cells and secretion of IFN-γ. The cells are
washed away, leaving the secreted IFN bound to the antibody-coated wells in
concentrated areas where the cells were sitting. The captured IFN is detected
with biotinylated anti-rhesus IFN antibody (detector Ab U-Cytech) followed by
alkaline phosphatase-conjugated streptavidin (Pharmingen 13043E). The addition
of insoluble alkaline phosphatase substrate results in dark spots in the wells
at the sites where the cells were located, leaving one spot for each T cell that
secreted IFN-γ.

The number of spots per well is directly related to the precursor frequency
of antigen-specific T cells. Gamma interferon was selected as the cytokine
visualized in this assay (using species-specific anti-gamma interferon
monoclonal antibodies) because it is the most common, and one of the most
abundant, cytokines synthesized and secreted by activated T lymphocytes. For
this assay, the number of spot forming cells (SFC) per million PBMCs is
determined for samples in the presence and absence (media control) of peptide
antigens. Data from Rhesus macaques on PBMC from post-dose-two material are
shown in Table 4.



Table 4: pV1J-NSOPTmut

Pep pools    21G    99C161    99C166
F (NS3p)     8      10        170
G (NS3h)     7      592       229
H (NS4)      3      14        16
I (NS5a)     5      71        36
L (NS5b)     14     23        11
M (NS5b)     3      35        8
DMSO         2      4         5

IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 5
mg DNA/dose in OPTIVAX/BAK of plasmid pV1Jns-NSOPTmut. Data are expressed as
SFC/10^6 PBMC.
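
The conversion of a raw spot count to SFC per million PBMC is a simple
normalization, sketched below in Python. The function names and the example
well (50 spots from 2.5 X 10^5 cells) are hypothetical. Subtracting the media
(DMSO) control is shown only as one common way to compare the two conditions;
the text above does not state that the tabulated values were
background-subtracted.

```python
def sfc_per_million(spot_count, cells_per_well):
    """Normalize a raw spot count to spot forming cells per 10^6 PBMC."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return spot_count * 1_000_000 / cells_per_well


def net_response(peptide_sfc, control_sfc):
    """Peptide response minus media control, floored at zero (assumed convention)."""
    return max(peptide_sfc - control_sfc, 0)


# Hypothetical well: 50 spots counted from 250,000 plated cells.
print(sfc_per_million(50, 250_000))

# Values for animal 99C161 from Table 4: pool G (NS3h) versus DMSO control.
print(net_response(592, 4))
```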

Example 5: Construction of Ad6 Pre-Adenovirus Plasmids

Ad6 pre-adenovirus plasmids were obtained as follows:

Construction of pAd6 E1-E3+ Pre-adenovirus Plasmid

An Ad6 based pre-adenovirus plasmid which can be used to generate first
generation Ad6 vectors was constructed either taking advantage of the extensive
sequence identity (approx. 98%) between Ad5 and Ad6 or containing only Ad6
regions. Homologous recombination was used to clone wt Ad6 sequences into a
bacterial plasmid.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad5 and Ad6 regions is illustrated in Figure 10. Cotransformation of
BJ5183 bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed
the Ad5 ITR cassette resulted in the circularization of the viral genome by
homologous recombination. The ITR cassette contains sequences from the right (bp
33798 to 35935) and left (bp 1 to 341 and bp 3525 to 5767) ends of the Ad5
genome separated by plasmid sequences containing a bacterial origin of
replication and an ampicillin resistance gene. The ITR cassette contains a
deletion of E1 sequences from Ad5 342 to 3524. The Ad5 sequences in the ITR
cassette provide regions of homology with the purified Ad6 viral DNA in which
recombination can occur.

Potential clones were screened by restriction analysis and one clone was
selected as pAd6E1-E3+. This clone was then sequenced in its entirety.
pAd6E1-E3+ contains Ad5 sequences from bp 1 to 341 and from bp 3525 to 5548, Ad6
bp 5542 to 33784, and Ad5 bp 33967 to 35935 (bp numbers refer to the wt sequence
for both Ad5 and Ad6). pAd6E1-E3+ contains the coding sequences for all Ad6
virion structural proteins which constitute its serotype specificity.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad6 regions is illustrated in Figure 11. Cotransformation of BJ5183
bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed the Ad6
ITR cassette resulted in the circularization of the viral genome by homologous
recombination. The ITR cassette contains sequences from the right (bp 35460 to
35759) and left (bp 1 to 450 and bp 3508 to 3807) ends of the Ad6 genome
separated by plasmid sequences containing a bacterial origin of replication and
an ampicillin resistance gene. These three segments were generated by PCR and
cloned sequentially into pNEB193, generating pNEBAd6-3 (the ITR cassette). The
ITR cassette contains a deletion of E1 sequences from Ad5 451 to 3507. The Ad6
sequences in the ITR cassette provide regions of homology with the purified Ad6
viral DNA in which recombination can occur.

Construction of pAd6 E1-E3- Pre-adenovirus Plasmids

Ad6 based vectors containing Ad5 regions and deleted in the E3 region were
constructed starting with pAd6E1-E3+ containing Ad5 regions. A 5322 bp
subfragment of pAd6E1-E3+ containing the E3 region (Ad6 bp 25871 to 31192) was
subcloned into pABS.3, generating pABSAd6E3. Three E3 deletions were then made
in this plasmid, generating three new plasmids: pABSAd6E3(1.8Kb) (deleted for
Ad6 bp 28602 to 30440), pABSAd6E3(2.3Kb) (deleted for Ad6 bp 28157 to 30437) and
pABSAd6E3(2.6Kb) (deleted for Ad6 bp 28157 to 30788). Bacterial recombination
was then used to substitute the three E3 deletions back into pAd6E1-E3+,
generating the Ad6 genome plasmids pAd6E1-E3-1.8Kb, pAd6E1-E3-2.3Kb and
pAd6E1-E3-2.6Kb.
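
The plasmid names follow from the sizes of the deleted intervals. As a worked
check (a sketch, assuming the quoted coordinates are inclusive at both ends,
which is consistent with the stated 5322 bp length of the bp 25871 to 31192
subfragment), the lengths can be computed in Python; the helper name
span_length is introduced here for illustration only.

```python
def span_length(first_bp, last_bp):
    """Number of base pairs in an inclusive coordinate range."""
    return last_bp - first_bp + 1


# Ad6 coordinates of the three E3 deletions quoted in the text.
e3_deletions = {
    "pABSAd6E3(1.8Kb)": (28602, 30440),
    "pABSAd6E3(2.3Kb)": (28157, 30437),
    "pABSAd6E3(2.6Kb)": (28157, 30788),
}

deletion_sizes = {name: span_length(*coords)
                  for name, coords in e3_deletions.items()}
subfragment_size = span_length(25871, 31192)

for name, size in deletion_sizes.items():
    print(name, size, "bp")
print("E3 subfragment:", subfragment_size, "bp")
```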
Example 6: Generation of Ad5 Genome Plasmid with the NS Sequence

A pcDNA3 plasmid (Invitrogen) containing the coding region
NS3-NS4A-NS4B-NS5A was digested with XmnI and NruI restriction enzymes, and the
DNA fragment containing the CMV promoter, the NS3-NS4A-NS4B-NS5A coding sequence
and the Bovine Growth Hormone (BGH) polyadenylation signal was cloned into the
unique EcoRV restriction site of the shuttle vector pDelE1Spa, generating the
Sva3-5A vector.

A pcDNA3 plasmid containing the coding region NS3-NS4A-NS4B-NS5A-NS5B was
digested with XmnI and EcoRI (partial digestion), and the DNA fragment
containing part of NS5A, the NS5B gene and the BGH polyadenylation signal was
cloned into the Sva3-5A vector, digested with EcoRI and BglII and blunted with
Klenow, generating the Sva3-5B vector.

The Sva3-5B vector was finally digested with SspI and Bst1107I restriction
enzymes, and the DNA fragment containing the expression cassette (CMV promoter,
NS3-NS4A-NS4B-NS5A-NS5B coding sequence and the BGH polyadenylation signal)
flanked by adenovirus sequences was co-transformed with ClaI-linearized pAd5HVO
(E1-, E3-) genome plasmid into the bacterial strain BJ5183, to generate
pAd5HVONS. pAd5HVO contains Ad5 bp 1 to 341, bp 3525 to 28133 and bp 30818 to
35935.

Example 7: Generation of Adenovirus Genome Plasmids with the NSmut Sequence

Adenovirus genome plasmids containing an NS-mut sequence were generated in
an Ad5 or Ad6 background. The Ad6 background contained Ad5 regions at bases 1 to
450, 3511 to 5548 and 33967 to 35935.

pV1JNS3-5Akozak was digested with BglII and XbaI restriction enzymes, and
the DNA fragment containing the Kozak sequence and the sequence coding
NS3-NS4A-NS4B-NS5A was cloned into a BglII and XbaI digested polypMRKpdelE1
shuttle vector. The resulting vector was designated shNS3-5Akozak.

PolypMRKpdelE1 is a derivative of MRKpdelE1(Pac/pIX/pack450) + CMVmin +
BGHpA(str.) modified by the insertion of a polylinker containing recognition
sites for BglII, PmeI, SwaI, XbaI and SalI into the unique BglII restriction
site present downstream of the CMV promoter. MRKpdelE1(Pac/pIX/pack450) + CMVmin
+ BGHpA(str.) contains Ad5 sequences from bp 1 to 5792 with a deletion of E1
sequences from bp 451 to 3510. The human CMV promoter and BGH polyadenylation
signal were inserted into the E1 deletion in an E1 parallel orientation with a
unique BglII site separating them.

The NS5B fragment, mutated to abrogate enzymatic activity and with a strong
translation termination at the 3' end, was obtained by assembly PCR and inserted
into the shNS3-5Akozak vector via homologous recombination, generating
polypMRKpdelE1NSmut. In polypMRKpdelE1NSmut the NS-mut coding sequence is under
the control of the CMV promoter and the BGH polyadenylation signal is present
downstream.

The gene expression cassette and the flanking regions, which contain
adenovirus sequences allowing homologous recombination, were excised by
digestion with PacI and Bst1107I restriction enzymes and co-transformed with
either pAd5HVO (E1-, E3-) or pAd6E1-E3-2.6Kb ClaI-linearized genome plasmids
into the bacterial strain BJ5183, to generate pAd5HVONSmut and
pAd6E1-,E3-NSmut, respectively.

pAd6E1-E3-2.6Kb contains Ad5 bp 1 to 341 and from bp 3525 to 5548, Ad6 bp
5542 to 28157 and from bp 30788 to 33784, and Ad5 bp 33967 to 35935 (bp numbers
refer to the wt sequence for both Ad5 and Ad6). In both plasmids the viral ITRs
are joined by plasmid sequences that contain the bacterial origin of replication
and an ampicillin resistance gene.

Example 8: Generation of Adenovirus Genome Plasmids with the NSOPTmut

The human codon-optimized synthetic gene (NSOPTmut) provided by SEQ. ID.
NO. 3, cloned into a pCRBlunt vector (Invitrogen), was digested with BamHI and
SalI restriction enzymes and cloned into the BglII and SalI restriction sites
present in the shuttle vector polypMRKpdelE1. The resulting clone
(polypMRKpdelE1NSOPTmut) was digested with PacI and Bst1107I restriction enzymes
and co-transformed with either pAd5HVO (E1-, E3-) or pAd6E1-E3-2.6Kb
ClaI-linearized genome plasmids into the bacterial strain BJ5183, to generate
pAd5HVONSOPTmut and pAd6E1-,E3-NSOPTmut, respectively.

Example 9: Rescue and Amplification of Adenovirus Vectors

Adenovectors were rescued in Per.C6 cells. Per.C6 were grown in 10%
FCS/DMEM supplemented with L-glutamine (final 4 mM), penicillin/streptomycin
(final 100 IU/ml) and 10 mM MgCl2. After infection, cells were kept in the same
medium supplemented with 5% horse serum (HS). For viral rescue, 2.5 X 10^6
Per.C6 were plated in 6 cm Ø Petri dishes.

Twenty-four hours after plating, cells were transfected by the calcium
phosphate method with 10 ng of the PacI-linearized adenoviral DNA. The DNA
precipitate was left on the cells for 4 hours. The medium was removed and 5%
HS/DMEM was added.

Cells were kept in a CO2 incubator until a cytopathic effect was visible (1
week). Cells and supernatant were recovered and subjected to 3X freeze/thaw
cycles (liquid nitrogen / water bath at 37°C). The lysate was centrifuged at
3000 rpm at - 4°C for 20 minutes and the recovered supernatant (corresponding to
a cell lysate containing virus passed on cells only once; P1) was used, in the
amount of 1 ml/dish, to infect 80-90% confluent Per.C6 in 10 cm Ø Petri dishes.
The infected cells were incubated until a cytopathic effect was visible, cells
and supernatant were recovered, and the lysate was prepared as described above
(P2).

P2 lysate (4 ml) was used to infect 2 X 15 cm Ø Petri dishes. The lysate
recovered from this infection (P3) was kept in aliquots at -80°C as a stock of
virus to be used as the starting point for big viral preparations. In this case,
1 ml of the stock was enough to infect 2 X 15 cm Ø Petri dishes, and the
resulting lysate (P4) was used for the infection of the Petri dishes devoted to
the large scale infection.

Further amplification was obtained from the P4 lysate, which was diluted in
medium without FCS and used to infect 30 X 15 cm Ø Petri dishes (with Per.C6
80%-90% confluent) in the amount of 10 ml/dish. Cells were incubated for 1 hour
in the CO2 incubator, with gentle mixing every 20 minutes. 12 ml/dish of 5%
HS/DMEM was added and cells were incubated until a cytopathic effect was visible
(about 48 hours).

Cells and supernatant were collected and centrifuged at 2K rpm for 20
minutes at 4°C. The pellet was resuspended in 15 ml of 0.1 M Tris pH=8.0. Cells
were lysed by 3X freeze/thaw cycles (liquid nitrogen / water bath at 37°C). 150
µl of 2 M MgCl2 and 75 µl of DNAse (10 mg of bovine pancreatic
deoxyribonuclease I in 10 ml of 20 mM Tris-HCl pH=7.4, 50 mM NaCl, 1 mM
dithiothreitol, 0.1 mg/ml bovine serum albumin, 50% glycerol) were added. After
a 1 hour incubation at 37°C in a water bath (vortex every 15 minutes) the lysate
was centrifuged at 4K rpm for 15 minutes at 4°C. The recovered supernatant was
ready to be applied on a CsCl gradient.

The CsCl gradients were prepared in SW40 ultra-clear tubes as follows:

0.5 ml of 1.5d CsCl
3 ml of 1.35d CsCl
3 ml of 1.25d CsCl
5 ml/tube of viral supernatant was applied

If necessary, the tubes were topped up with 0.1 M Tris-Cl pH=8.0. Tubes were
centrifuged at 35K rpm for 1 hour at -10°C with rotor SW40. The viral bands
(located at the 1.25/1.35 interface) were collected using a syringe.

The virus was transferred into a new SW40 ultra-clear tube and 1.35d CsCl
was added to top the tube up. After centrifugation at 35K rpm for 24 hours at
10°C in the rotor SW40, the virus was collected in the smallest possible volume
and dialyzed extensively against buffer A105 (5 mM Tris, 5% sucrose, 75 mM NaCl,
1 mM MgCl2, 0.005% polysorbate 80 pH=8.0). After dialysis, glycerol was added to
a final concentration of 10% and the virus was stored in aliquots at -80°C.

Example 10: Enhanced Adenovector Rescue

First generation Ad5 and Ad6 vectors carrying the HCV NSOPTmut transgene
were found to be difficult to rescue. A possible block in the rescue process
might be attributed to an inefficient replication of plasmid DNA, which is a
sub-optimal template for the replication machinery of adenovirus. The absence of
the terminal protein linked to the 5' ends of the DNA (normally present in the
viral DNA), associated with the very high G-C content of the transgene inserted
in the E1 region of the vector, may be causing a substantial reduction in the
replication rate of the plasmid-derived adenovirus.

To set up a more efficient and reproducible procedure for rescuing Ad
vectors, an expression vector (pE2; Figure 19) containing all E2 proteins
(polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6
under the control of a tet-inducible promoter was employed. The transfection of
pE2 in combination with a normal preadeno plasmid in PerC6 and in 293 cells
leads to a strong increase of Ad DNA replication and to a more efficient
production of complete infectious adenovirus particles.

Plasmid Construction

pE2 is based on the cloning vector pBI (CLONTECH) with the addition of two
elements to allow episomal replication and selection in cell culture: (1) the
EBV-OriP (EBV [nt] 7421-8042) region, permitting plasmid replication in
synchrony with the cell cycle when EBNA-1 is expressed, and (2) the
hygromycin-B phosphotransferase (HPH)-resistance gene, allowing a positive
selection of transformed cells. The two transcriptional units for the adenoviral
genes E2 a and b and E4-Orf6 were constructed and assembled in pE2 as described
below.

The Ad5-Polymerase ClaI/SphI fragment and the Ad5-pTP Acc65/EcoRV fragment
were obtained from pVac-Pol and pVac-pTP (Stunnenberg et al., NAR 16:2431-2444,
1988). Both fragments were filled with Klenow and cloned into the SalI (filled)
and EcoRV sites of pBI, respectively, obtaining pBI-Pol/pTP.

The EBV-OriP element from pCEP4 (Invitrogen) was first inserted within two
chicken β-globin insulator dimers by cloning it into the BamHI site of pJC13-1
(Chung et al., Cell 74:505-14, 1993). The HS4-OriP fragment from pJC13-OriP was
then cloned inside pSA1mv (a plasmid containing a tk-Hygro-B resistance gene
expression cassette as well as the Ad5 replication origin), the ITRs arranged as
a head-to-tail junction, obtained by PCR from pFG140 (Graham, EMBO J.
3:2917-2922, 1984) using the following primers:
5'-TCGAATCGATACGCGAACCTACGG-3' (SEQ. ID. NO. 16) and
5'-TCGACGTGTCGACTTCGAAGCGCACACCAAAAACGTC-3' (SEQ. ID. NO. 17), thus generating
pMVHS4Orip. A DNA fragment from pMVHS4Orip, containing the insulated OriP, Ad5
ITR junction and tk-HygroB cassette, was then inserted into the pBI-Pol/pTP
vector restricted with AseI/AatII, generating pBI-Pol/pTPHS4.

To construct the second transcriptional unit expressing Ad5-Orf6 as well as
Ad5-DBP, E4orf6 (Ad 5 [nt] 33193-34077) obtained by PCR was first inserted into
the pBI vector, generating pBI-Orf6. Subsequently, the DBP coding DNA sequence
(Ad 5 [nt] 22443-24032) was inserted into pBI-Orf6, obtaining the second
bi-directional Tet-regulated expression vector (pBI-DBP/E4orf6). The original
polyA signals present in pBI were substituted with BGH and SV40 polyA.

pBI-DBP/E4orf6 was then modified by inserting a DNA fragment containing the
Adeno5-ITRs arranged in a head-to-tail junction plus the hygromycin B resistance
gene obtained from plasmid pSA-1mv. The new plasmid pBI-DBP/E4orf6shuttle was
then used as donor plasmid to insert the second tet-regulated transcriptional
unit into pBI-Pol/pTPHS4 by homologous recombination using E. coli strain
BJ5183, obtaining pE2.

Cell lines, Transfections and Virus Amplification

PerC6 cells were cultured in Dulbecco's modified Eagle's Medium (DMEM) plus
10% fetal bovine serum (FBS), 10 mM MgCl2, penicillin (100 U/ml), streptomycin
(100 µg/ml) and 2 mM glutamine.

All transient transfections were performed using Lipofectamine 2000
(Invitrogen) as described by the manufacturer. 90% confluent PER.C6™ plated in
6-cm plates were transfected with 3.5 µg of Ad5/6NSOPTmut pre-adeno plasmids,
digested with PacI, alone or in combination with 5 µg pE2 plus 1 µg pUHD52.1.
pUHD52.1 is the expression vector for the reverse tet transactivator 2 (rtTA2)
(Urlinger et al., Proc. Natl. Acad. Sci. U.S.A. 97(14):7963-7968, 2000). Upon
transfection, cells were cultivated in the presence of 1 µg/ml of doxycycline to
activate pE2 expression. Seven days post-transfection, cells were harvested and
cell lysate was obtained by three cycles of freeze-thaw. Two ml of cell lysate
were used to infect a second 6-cm dish of PerC6. Infected cells were cultivated
until a full CPE was observed, then harvested. The virus was serially passaged
five times as described above, then purified on a CsCl gradient. The DNA
structure of the purified virus was checked by endonuclease digestion and
agarose gel electrophoresis analysis and compared to the original pre-adeno
plasmid restriction pattern.

Example 11: Partial Optimization of HCV Polyprotein Encoding Nucleic Acid

Partial optimization of HCV polyprotein encoding nucleic acid was performed
to facilitate the production of adenovectors containing codons optimized for
expression in a human host. The overall objective was to provide for increased
expression due to codon optimization, while facilitating the production of an
adenovector encoding HCV polyprotein.

Several difficulties were encountered in producing an adenovector encoding
HCV polyprotein with codons optimized for expression in a human host. An
adenovector containing an optimized sequence (SEQ. ID. NO. 3) was found to be
more difficult to synthesize and rescue than an adenovector containing a
non-optimized sequence (SEQ. ID. NO. 2).

The difficulties in producing an adenovector containing SEQ. ID. NO. 3 were
attributed to a high GC content. A particularly problematic region was the
region at about position 3900 of NSOPTmut (SEQ. ID. NO. 3).

Alternative versions of the optimized HCV encoding nucleic acid sequence
were designed to facilitate its use in an adenovector. The alternative versions,
compared to NSOPTmut, were designed to have a lower overall GC content and to
reduce or avoid the presence of potentially problematic motifs of consecutive
G's or C's, while maintaining a high level of codon optimization to allow
improved expression of the encoded polyprotein and the individual cleavage
products.

A starting point for the generation of a suboptimally codon-optimized
sequence is the coding region of the NSOPTmut nucleotide sequence (bases 7 to
5961 of SEQ. ID. NO. 3). Values for codon usage frequencies (normalized to a
total of 1.0 for each amino acid) were taken from the file human_high.cod
available in the Wisconsin Package Version 10.3 (Accelrys Inc., a wholly owned
subsidiary of Pharmacopeia, Inc).

To reduce the local and overall GC content, a table defining preferred
codon substitutions for each amino acid was manually generated. For each amino
acid, the codon having 1) a lower GC content as compared to the most frequent
codon and 2) a relatively high observed codon usage frequency (as defined in
human_high.cod) was chosen as the replacement codon. For example, for Arg the
codon with the highest frequency is CGC. Out of the other five alternative
codons encoding Arg (CGG, AGG, AGA, CGT, CGA), three (AGG, CGT, CGA) reduce the
GC content by 1 base, one (AGA) by two bases and one (CGG) by 0 bases. Since the
AGA codon is listed in human_high.cod as having a relatively low usage frequency
(0.1), the codon substituting CGC was therefore chosen to be AGG, with a
relative frequency of 0.18. Similar criteria were applied in order to establish
codon replacements for the other amino acids, resulting in the list shown in
Table 5. Parameters applied in the following optimization procedure were
determined empirically such that the resulting sequence maintained a
considerably improved codon usage (for each amino acid) and the GC content
(overall and in the form of local stretches of consecutive G's and/or C's) was
decreased.
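
The Arg example can be checked with a few lines of Python. This is an
illustrative sketch only; the names gc_count and gc_reduction are introduced
here, and the only usage frequencies included are the two quoted in the text
(AGA 0.1, AGG 0.18), since the remaining human_high.cod values are not given.

```python
def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


most_frequent = "CGC"                      # most frequent Arg codon per the text
alternatives = ["CGG", "AGG", "AGA", "CGT", "CGA"]

# Reduction in GC bases relative to CGC for each alternative Arg codon.
gc_reduction = {codon: gc_count(most_frequent) - gc_count(codon)
                for codon in alternatives}
print(gc_reduction)

# Relative usage frequencies quoted in the text for two of the candidates.
quoted_frequency = {"AGA": 0.10, "AGG": 0.18}
chosen = max(quoted_frequency, key=quoted_frequency.get)
print(chosen)
```

The counts reproduce the statement that AGG, CGT and CGA each remove one G/C
base, AGA removes two and CGG removes none; of the two codons whose frequencies
are quoted, AGG has the higher usage.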

Two examples of partially optimized HCV encoding sequences are provided by
SEQ. ID. NO. 10 and SEQ. ID. NO. 11. SEQ. ID. NO. 10 provides an HCV encoding
sequence that is partially optimized throughout. SEQ. ID. NO. 11 provides an HCV
encoding sequence fully optimized for codon usage with the exception of a region
that was partially optimized.

Codon optimization was performed using the following procedure:

Step 1) The coding region of the input fully optimized NSOPTmut sequence was
analyzed using a sliding window of 3 codons (9 bases), shifting the window by
one codon after each cycle. Whenever a stretch containing 5 or more consecutive
C's and/or G's was detected in the window, the following replacement rule was
applied: Let N indicate the number of codon replacements previously performed.
If N is odd, replace the middle codon in the window with the codon specified in
Table 5; if N is even, replace the third (terminal) codon in the window with the
codon specified in a codon optimization table such as human_high.cod. If Leu or
Val is present at the second or third codon, do not apply any replacement, in
order not to introduce Leu or Val codons with very low relative codon usage
frequency (see, for example, human_high.cod). In the following cycle, analysis
of the shifted window was then applied to a sequence containing the replacements
of the previous cycle.

The alternating replacement of the middle and terminal codon in the 3 codon
window was found empirically to give a more satisfying overall maintenance of
optimized codon usage while also reducing GC content (as judged from the final
sequence after the procedure). In general, however, the precise replacement
strategy depends on the amino acid sequence encoded by the nucleotide sequence
under analysis and will have to be determined empirically.
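
The sliding-window rule of Step 1 can be sketched in Python as follows. This is
a simplified illustration under stated assumptions, not the program actually
used: the REPLACEMENT table is a hypothetical stand-in for Table 5 (only the Arg
entry, AGG, is taken from the text), the same table is used for both the middle
and the terminal position, and the function names are introduced here.

```python
import re

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Hypothetical stand-in for Table 5: amino acid -> lower-GC replacement codon.
# Only the Arg entry (AGG) comes from the text; the others are placeholders.
REPLACEMENT = {"R": "AGG", "A": "GCT", "G": "GGA", "P": "CCT"}

GC_RUN = re.compile(r"[CG]{5,}")  # 5 or more consecutive C's and/or G's


def translate(seq):
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))


def reduce_gc_runs(seq, table=REPLACEMENT):
    """Apply the alternating middle/terminal codon replacement of Step 1."""
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    n_replaced = 0  # N in the text
    for start in range(len(codons) - 2):
        window = codons[start:start + 3]  # reflects earlier replacements
        if not GC_RUN.search("".join(window)):
            continue
        aas = [CODON_TO_AA[c] for c in window]
        if aas[1] in "LV" or aas[2] in "LV":
            continue  # never touch windows with Leu/Val at position 2 or 3
        pos = 1 if n_replaced % 2 == 1 else 2  # odd N: middle, even N: terminal
        new = table.get(aas[pos])
        if new is not None and new != window[pos]:
            codons[start + pos] = new
            n_replaced += 1
    return "".join(codons)


demo_in = "GCCCGCGGCCGCGCC"   # Ala-Arg-Gly-Arg-Ala, placeholder sequence
demo_out = reduce_gc_runs(demo_in)
print(demo_out)
```

Because each window is read from the already modified codon list, later cycles
see the replacements of earlier cycles, as the procedure requires. The encoded
amino acid sequence is unchanged as long as every table entry is a synonymous
codon.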

Step 2) The sequence containing all the codon replacements performed during
step 1) was then subjected to an additional analysis using a sliding window of
21 codons (63 bases) in length: according to an adjustable parameter, the
overall GC content in the window was determined. If the GC content in the window
was higher than 70%, the following codon replacement strategy was applied: In
the window, replace the codons for the amino acids Asn, Asp, Cys, Glu, His, Ile,
Lys, Phe, Tyr by the codons given in Table 5. Restriction of the replacement to
this set of amino acids was motivated by the fact that a) the replacement codon
still has an acceptably high frequency of usage in human_high.cod and b) the
average overall human codon usage in CUTG for the replacement codon is nearly as
high as for the most frequent codon. In the following cycle, analysis of the
shifted window is then applied to a sequence containing the replacements of the
previous cycle.

The threshold of 70% was determined empirically by compromising between an
overall reduction in GC content and maintenance of a high codon optimization for
the individual amino acids. As in step 1), the precise replacement strategy
(choice of amino acids and GC content threshold value) will again depend on the
amino acid sequence encoded by the nucleotide sequence under analysis and will
have to be determined empirically.

Step 3) The sequence generated by steps 1) and 2) was then manually
edited and additional codons were changed according to the following criteria:
Regions still having a GC content higher than 70% over a window of 21 codons were
examined manually and a few codons were replaced again following the scheme given
in Table 5.

Subsequent steps were performed to provide for useful restriction sites,
remove possible open reading frames on the complementary strand, to add
homologous recombinant regions, to add a Kozak signal, and to add a terminator.
These steps are numbered 4-7.
Step 4) The sequence generated in step 3 was examined for the absence
of certain restriction sites (BglII, PmeI and XbaI) and presence of only 1 StuI site to
allow a subsequent cloning strategy using a subset of restriction enzymes. Two sites
(one for BglII and one for StuI) were removed from the sequence by replacing codons
that were part of the respective recognition sites.
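The check described in step 4) can be illustrated with a short sketch. The recognition sequences used here (BglII AGATCT, PmeI GTTTAAAC, XbaI TCTAGA, StuI AGGCCT) are the standard ones for these enzymes and are not given in the text; since all four are palindromic, scanning one strand is sufficient. The function names are hypothetical.

```python
# Standard recognition sequences (an addition; the text names only the enzymes).
SITES = {
    "BglII": "AGATCT",
    "PmeI": "GTTTAAAC",
    "XbaI": "TCTAGA",
    "StuI": "AGGCCT",
}


def count_sites(seq):
    """Count occurrences (including overlapping ones) of each site in seq."""
    seq = seq.upper()
    return {
        name: sum(seq.startswith(site, i) for i in range(len(seq)))
        for name, site in SITES.items()
    }


def passes_step4(seq):
    """True if BglII, PmeI and XbaI are absent and exactly one StuI site is present."""
    counts = count_sites(seq)
    return (
        counts["BglII"] == 0
        and counts["PmeI"] == 0
        and counts["XbaI"] == 0
        and counts["StuI"] == 1
    )
```

A sequence failing this check would then be edited by swapping a synonymous codon inside the offending site, as the step describes.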
Step 5) The sequence generated by steps 1) through 4) was then
modified to allow subsequent generation of a modified NSOPTmut
sequence (by homologous recombination). In the sequence obtained from steps 1)
through 4) the segment comprising bases 3556 to 3755 and the segment comprising
bases 4456 to 4656 were replaced by the corresponding segments from NSOPTmut.
The segment comprising bases 3556 to 4656 of SEQ. ID. NO. 10 can be used to
replace the problematic region in NSOPTmut (around position 3900) by homologous
recombination, thus creating the variant of NSOPTmut having the sequence of SEQ.
ID. NO. 11.

Step 6) Analysis of the sequence generated through steps 1) to 5)
revealed a potential open reading frame spanning nearly the complete fragment on the
complementary strand. Removal of all codons CTA and TTA (Leu) and TCA (Ser)
from the sense strand effectively removed all stop codons in one of the reading frames
on the complementary strand. Although the likelihood of transcription of this
complementary strand open reading frame and subsequent translation into protein is
very small, in order to exclude a potential interference with the transcription and
subsequent translation of the sequence encoded on the sense strand, TCA codons for
Ser were introduced on the sense strand approximately every 500 bases. No changes were
introduced in the segments introduced during step 5) to allow homologous
recombination. The TCA codon for Ser was preferred over the CTA and TTA codons
for Leu because of the higher relative frequency for TCA (0.05) as compared to CTA
(0.02) and TTA (0.03) in human_high.cod. In addition, the average human codon
usage from CUTG favored TCA (0.14 against 0.07 for CTA and TTA).
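The reasoning in step 6) rests on the fact that the sense codons CTA, TTA and TCA read as the stop codons TAG, TAA and TGA on the complementary strand. A minimal sketch that makes this explicit is shown below; the helper names are hypothetical and the code only inspects the complementary-strand frame that is aligned with the sense-strand codons.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_stops(sense):
    """List (position, codon) for sense codons that read as a stop codon
    on the complementary strand, in the frame aligned with the sense codons."""
    codons = [sense[i:i + 3] for i in range(0, len(sense) - 2, 3)]
    return [
        (index * 3, codon)
        for index, codon in enumerate(codons)
        if reverse_complement(codon) in STOP_CODONS
    ]
```

Reintroducing a TCA codon roughly every 500 bases, as described above, would make `antisense_stops` return at least one hit per 500-base stretch and so interrupt the complementary-strand reading frame.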

Step 7) In a final step GCCACC was added at the 5' end of the
sequence to generate an optimized internal ribosome entry site (Kozak signal) and a
TAAA stop signal was added at the 3' end. To maintain the initiation of translation
properties of NSsuboptmut the first 8 codons of the coding region were kept identical
to the NSOPTmut sequence. The resulting sequence was again checked for the
absence of BglII, PmeI and XbaI recognition sites and the presence of only 1 StuI site.

The NSsuboptmut sequence (SEQ. ID. NO. 10) has an overall reduced
GC content (63.5%) as compared to NSOPTmut (70.3%) and maintains a well
optimized level of codon usage. Nucleotide sequence identity of
NSsuboptmut is 77.2% with respect to NSmut.

Table 5: Definition of codon replacements performed during steps 1) and 2).

Amino   Most frequent   Relative    Reduction in GC   Replacement   Relative
Acid    codon           frequency   content (bases)   codon         frequency

Amino acids where the replacement codon reduces the codon GC content by 1 base
Ala     GCC             0.51        1                 GCT           0.17
Arg     CGC             0.37        1                 AGG           0.18
Asn     AAC             0.78        1                 AAT           0.22
Asp     GAC             0.75        1                 GAT           0.25
Cys     TGC             0.68        1                 TGT           0.32
Glu     GAG             0.75        1                 GAA           0.25
Gln     CAG             0.88        1                 CAA           0.12
Gly     GGG             0.50        1                 GGA           0.14
His     CAC             0.79        1                 CAT           0.21
Ile     ATC             0.77        1                 ATT           0.18
Lys     AAG             0.82        1                 AAA           0.18
Phe     TTC             0.80        1                 TTT           0.20
Pro     CCC             0.48        1                 CCT           0.19
Ser     AGC             0.34        1                 TCT           0.13
Thr     ACC             0.51        1                 ACA           0.14
Tyr     TAC             0.74        1                 TAT           0.26

Amino acids with no alternative codon
Met     ATG             1.00        0                 ATG           1.00
Trp     TGG             1.00        0                 TGG           1.00

Amino acids where the replacement codon has a very low relative frequency. These
amino acids were excluded from the replacement procedure
Leu     CTG             0.58        1                 TTG           0.06
Val     GTG             0.64        1                 GTT           0.07



Example 12: Virus Characterization 

Adenovectors were characterized by: (a) measuring the physical
particles/ml; (b) running a TaqMan PCR assay; and (c) checking protein expression
after infection of HeLa cells.

a) Physical Particles Determination 

CsCl purified virus was diluted 1/10 and 1/100 in 0.1% SDS PBS. As
a control, buffer A105 was used. These dilutions were incubated 10 minutes at 55°C.
After spinning the tubes briefly, O.D. at 260 nm was measured. The amount of viral
particles was calculated as follows: 1 OD at 260 nm = 1.1 x 10^12 physical particles/ml.
The results were typically between 5 x 10^11 and 1 x 10^12 physical particles/ml.
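As a worked illustration of the conversion above, the following sketch turns an OD reading into a particle concentration. It assumes the reading has already been blank-corrected against the buffer control and that the result is multiplied by the dilution factor to recover the concentration of the undiluted stock; both points are assumptions, as the text gives only the conversion factor. The function name is hypothetical.

```python
# Conversion factor stated in the text: 1 OD unit at 260 nm = 1.1e12 particles/ml.
PARTICLES_PER_OD260 = 1.1e12


def physical_particles_per_ml(od260, dilution_factor):
    """Estimate physical particles/ml of the undiluted stock.

    od260           -- blank-corrected absorbance at 260 nm (assumed)
    dilution_factor -- 10 for a 1/10 dilution, 100 for a 1/100 dilution
    """
    if od260 < 0 or dilution_factor <= 0:
        raise ValueError("od260 must be >= 0 and dilution_factor > 0")
    return od260 * dilution_factor * PARTICLES_PER_OD260
```

Under these assumptions, a 1/10 dilution reading 0.05 OD corresponds to about 5.5 x 10^11 physical particles/ml, which falls within the typical range reported above.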

b) TaqMan PCR Assay 

TaqMan PCR assay was used for adenovector genome quantification
(Q-PCR particles/ml). TaqMan PCR assay was performed using the ABI Prism 7700
sequence detector. The reaction was performed in a final 50 µl volume in the
presence of oligonucleotides (at final 200 nM) and probe (at final 200 µM) specific
for the adenoviral backbone. The virus was diluted 1/10 in 0.1% SDS PBS and
incubated 10 minutes at 55°C. After spinning the tube briefly, serial 1/10 dilutions (in
water) were prepared. 10 µl of the 10^-3, 10^-5 and 10^-7 dilutions were used as templates in
the PCR assay.

The amount of particles present in each sample was calculated on the
basis of a standard curve run in the same experiment. Typically results were between
1 x 10^12 and 3 x 10^12 Q-PCR particles/ml.

c) Expression of HCV Non-Structural Proteins 

Expression of HCV NS proteins was tested by infection of HeLa cells.
Cells were plated the day before the infection at 1.5 x 10^6 cells/dish (10 cm Ø Petri
dishes). Different amounts of CsCl purified virus corresponding to m.o.i. of 50, 250
and 1250 pp/cell were diluted in medium (FCS free) up to a final volume of 5 ml. The
diluted virus was added on the cells and incubated for 1 hour at 37°C in a CO2
incubator (gently mixing every 20 minutes). 5 ml of 5% HS-DMEM was added and
the cells were incubated at 37°C for 48 hours.
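The amount of virus needed for a given m.o.i. follows directly from the cell number per dish. The sketch below is illustrative only; the stock concentration passed to `stock_volume_ml` is a hypothetical input that would come from the physical particle determination in section a), and the function names are not from the text.

```python
def particles_needed(moi, cells_per_dish):
    """Physical particles required to reach `moi` particles per cell."""
    return moi * cells_per_dish


def stock_volume_ml(moi, cells_per_dish, stock_pp_per_ml):
    """Volume of virus stock (ml) delivering the required particles."""
    if stock_pp_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return particles_needed(moi, cells_per_dish) / stock_pp_per_ml
```

For 1.5 x 10^6 cells per dish, the three m.o.i. values of 50, 250 and 1250 pp/cell correspond to 7.5 x 10^7, 3.75 x 10^8 and 1.875 x 10^9 particles per dish, each brought to 5 ml with medium.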
Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were run on 10% SDS-acrylamide gel, blotted on nitrocellulose and assayed with
antibodies directed against NS3, NS5a and NS5b in order to check the correct
polyprotein cleavage. Mock-infected cells were used as a negative control. Results
from representative experiments testing the Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut
and MRKAd6-NSOPTmut are shown in Figure 14.

Example 13: Mice Immunization with Adenovectors Encoding Different NS 
Cassettes 

The adenovectors Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut and
MRKAd6-NSOPTmut were injected in C57Black6 mice to evaluate their
potential to elicit anti-HCV immune responses. Groups of animals (N=9-10) were
injected intramuscularly with 10^9 pp of CsCl purified virus. Each animal received two
doses three weeks apart.

Humoral immune response against the NS3 protein was measured in
post dose two sera from C57Black6 immunized mice by ELISA on bacterially
expressed NS3 protease domain. Antibodies specific for the tested antigen were
detected with geometric mean titers (GMT) ranging from 100 to 46000 (Tables 6, 7, 8
and 9).
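The geometric mean titer reported for each group can be reproduced from the individual titers. The sketch below uses the ten Ad5-NS titers listed in Table 6; the function name is hypothetical, and the calculation shown is the ordinary geometric mean, which is assumed to be what was used.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean: exponential of the mean of the natural logs."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual titers for mice 1-10 as listed in Table 6 (Ad5-NS).
table6_titers = [50, 253, 50, 50, 50, 2257, 504, 50, 50, 50]
table6_gmt = geometric_mean_titer(table6_titers)
```

Rounded to the nearest integer, `table6_gmt` agrees with the GMT of 108 given in Table 6.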

Table 6: Ad5-NS

Mice n.   1      2      3      4      5      6      7      8      9      10     GMT
Titer     50     253    50     50     50     2257   504    50     50     50     108

Table 7: Ad5-NSmut

Mice n.   11     12     13     14     15     16     17     18     19     20     GMT
Titer     3162   78850  87241  6796   12134  3340   18473  13093  76167  49593  23645



Table 8: MRKAd6-NSmut

Mice n.   21      22     23     24     25     26     27     28     29     30     GMT
Titer     125626  39751  40187  65834  60619  69933  21555  49348  29290  26859  46461



Table 9: MRKAd6-NSOPTmut

Mice n.   31     32     33     34     35     36     37     GMT
Titer     25430  3657   893    175    10442  49540  173    2785



T cell response in C57Black6 mice was analyzed by the quantitative
ELISPOT assay measuring the number of IFNγ secreting T cells in response to five
pools (named from F to L+M) of 20mer peptides overlapping by ten residues
encompassing the NS3-NS5B sequence. Specific CD8+ response induced in
C57Black6 mice was analyzed by the same assay using a 20mer peptide
encompassing a CD8+ epitope for C57Black6 mice (pep1480). Cells secreting IFNγ
in an antigen-specific manner were detected using a standard ELISPOT assay.

Spleen cells, splenocytes and peptides were produced and treated as
described in Example 3, supra. Representative data from groups of C57Black6 mice
(N=9-10) immunized with two injections of 10^9 viral particles of vectors Ad5-NS,
MRKAd5-NSmut and MRKAd6-NSmut are shown in Figure 15.

Example 14: Immunization of Rhesus macaques with Adenovectors 

Rhesus macaques (N=3-4) were immunized by intramuscular injection
of CsCl purified Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut or MRKAd6-NSOPTmut
virus. Each animal received two doses of 10^11 or 10^10 vp in the deltoid
muscle at 0 and 4 weeks.

CMI was measured at different time points by a) IFN-γ ELISPOT (see
Example 3, supra), b) IFN-γ ICS and c) bulk CTL assays. These assays measure HCV
antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be used for a
variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5a, NS5b) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
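The peptide design described above (20-mers offset by 10 amino acids) can be sketched as a simple tiling of a protein sequence. How the final residues were covered when the protein length is not a multiple of the offset is not stated; the sketch assumes a last peptide anchored at the C-terminus. The function name is hypothetical.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Tile `protein` with peptides of `length` residues, starting every `offset`.

    Assumption (not from the text): if the regular tiling leaves C-terminal
    residues uncovered, one extra peptide ending at the last residue is added.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    if len(protein) <= length:
        return [protein]
    starts = list(range(0, len(protein) - length + 1, offset))
    if starts[-1] + length < len(protein):
        starts.append(len(protein) - length)
    return [protein[s:s + length] for s in starts]
```

With the default settings each peptide shares its last ten residues with the first ten of the next one, so any epitope of up to 11 residues is contained whole in at least one peptide.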

IFN-γ ICS

For IFN-γ ICS, 2 x 10^6 PBMC in 1 ml R10 (RPMI medium,
supplemented with 10% FCS) were stimulated with peptide pool antigens. Final
concentration of each peptide was 2 µg/ml. Cells were incubated for 1 hour in a CO2
incubator at 37°C and then Brefeldin A was added to a final concentration of 10 ng/ml
to inhibit the secretion of soluble cytokines. Cells were incubated for an additional
14-16 hours at 37°C.

Stimulation was done in the presence of co-stimulatory antibodies:
CD28 and CD49d (anti-humanCD28 BD340975 and anti-humanCD49d BD340976).
After incubation, cells were stained with fluorochrome-conjugated antibodies for
surface antigens: anti-CD3, anti-CD4, anti-CD8 (CD3-APC Biosource APS0301,
CD4-PE BD345769, CD8-PerCP BD345774).

To detect intracellular cytokines, cells were treated with FACS
permeabilization buffer 2 (BD340973), 2x final concentration. Once fixed and
permeabilized, cells were incubated with an antibody against human IFN-γ, IFN-γ
FITC (Biosource AHC4338).

Cells were resuspended in 1% formaldehyde in PBS and analyzed by
FACS within 24 hours. Four color FACS analysis was performed on a FACSCalibur
instrument (Becton Dickinson) equipped with two lasers. Acquisition was done
gating on the lymphocyte population in the Forward versus Side Scatter plot coupled
with the CD3, CD8 positive populations. At least 30,000 events of the gate were
taken. The positive cells are expressed as number of IFN-γ expressing cells per 10^6
lymphocytes.
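The normalization to 10^6 lymphocytes is a simple proportion. The sketch below assumes the denominator is the number of events acquired in the lymphocyte gate; the exact denominator used is not spelled out in the text, and the function name is hypothetical.

```python
def ifn_positive_per_million(positive_events, gated_lymphocytes):
    """Scale a count of IFN-gamma positive events to 10^6 gated lymphocytes."""
    if gated_lymphocytes <= 0:
        raise ValueError("gated_lymphocytes must be positive")
    if positive_events < 0:
        raise ValueError("positive_events must be non-negative")
    return positive_events * 1_000_000 / gated_lymphocytes
```

For example, 15 positive events among 30,000 gated events would be reported as 500 IFN-γ expressing cells per 10^6 lymphocytes.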

IFN-γ ELISPOT and IFN-γ ICS data from immunized monkeys after
one or two injections of 10^10 or 10^11 vp of the different adenovectors are reported in
Figures 16A-16D, 17A, and 17B.

Bulk CTL Assays

A distinguishing effector function of T lymphocytes is the ability of 
subsets of this cell population to directly lyse cells exhibiting appropriate MHC- 
associated antigenic peptides. This cytotoxic activity is most often associated with 
CD8+ T lymphocytes. 

PBMC samples were infected with recombinant vaccine viruses
expressing HCV antigens in vitro for approximately 14 days to provide antigen
restimulation and expansion of memory T cells. Cytotoxicity against autologous B
cell lines treated with peptide antigen pools was tested.

The lytic function of the culture is measured as a percentage of specific
lysis resulting from chromium release from target cells during a 4 hour incubation
with CTL effector cells. Specific cytotoxicity is measured and compared to irrelevant
antigen or excipient-treated B cell lines. This assay is semi-quantitative and is the
preferred means for determining whether CTL responses were elicited by the vaccine.
Data after two injections from monkeys immunized with 10^11 vp/dose with
adenovectors Ad5-NS, MRKAd5-NSmut and MRKAd6-NSmut are reported in
Figures 18A-18F.
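The text does not give the formula used for percent specific lysis. A conventional chromium release calculation, shown here only as an assumed illustration, compares the release from targets incubated with effector cells against spontaneous release (targets alone) and maximum release (fully lysed targets). The function and parameter names are hypothetical.

```python
def percent_specific_lysis(experimental, spontaneous, maximum):
    """Conventional chromium release formula (assumed, not taken from the text):

        100 * (experimental - spontaneous) / (maximum - spontaneous)

    All three inputs are chromium counts (e.g. cpm) from the same experiment.
    """
    if maximum <= spontaneous:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental - spontaneous) / (maximum - spontaneous)
```

The same calculation would be run on the irrelevant-antigen or excipient-treated B cell lines, and the difference between the two values indicates antigen-specific killing.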

Other embodiments are within the following claims. While several
embodiments have been shown and described, various modifications may be made
without departing from the spirit and scope of the present invention.

WHAT IS CLAIMED IS:

1. A nucleic acid comprising a nucleotide sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ ID
NO: 1, provided that said polypeptide has sufficient protease activity to process itself
to produce an NS5B protein and said NS5B protein is enzymatically inactive.

2. The nucleic acid of claim 1, wherein said nucleotide sequence
is substantially similar to the coding sequence of SEQ ID NO: 2.

3. The nucleic acid of claim 1, wherein said nucleotide sequence
encodes for the polypeptide of SEQ ID NO: 1.

4. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

5. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2 or SEQ ID NO: 3.

6. The nucleic acid of any one of claims 1-5, wherein said nucleic 
acid is an expression vector capable of expressing said polypeptide from said 
nucleotide sequence in a human cell. 

7. A nucleic acid comprising a gene expression cassette able to
express a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to
SEQ ID NO: 1 in a human cell, provided that said polypeptide can process itself to
produce an NS5B protein and said NS5B protein is enzymatically inactive, said
expression cassette comprising:

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding said polypeptide;

b) a 5' ribosome binding site functionally coupled to said nucleotide
sequence,

c) a terminator joined to the 3' end of said nucleotide sequence, and

d) a 3' polyadenylation signal functionally coupled to said nucleotide
sequence.

8. The nucleic acid of claim 7, wherein said nucleotide sequence
is substantially similar to either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

9. The nucleic acid of claim 8, wherein said nucleic acid is a
shuttle vector further comprising a selectable marker, an origin of replication, a first
adenovirus homology region and a second adenovirus homology region flanking said
expression cassette, wherein said first homology region has at least about 100 base
pairs substantially homologous to at least the right end of a wild-type adenovirus region
from about base pairs 1-425, and said second homology region has at least about 100
base pairs substantially homologous to at least the left end of a wild-type adenovirus
region from about base pairs 3511-5792 of Ad5 or corresponding region of another
adenovirus.

10. The nucleic acid of claim 9, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

11. The nucleic acid of claim 9, wherein said nucleotide sequence
is SEQ ID NO: 2.

12. The nucleic acid of claim 9, wherein said nucleotide sequence
is either SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

13. The nucleic acid of claim 8, wherein said nucleic acid is a
plasmid suitable for administration into a human and further comprises a prokaryotic
origin of replication and a gene coding for a selectable marker.

14. The nucleic acid of claim 13, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

15. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

16. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of SEQ ID NO: 2 or SEQ ID NO: 3.

17. The nucleic acid of claim 14, wherein said promoter is the
human intermediate early cytomegalovirus promoter (intron A), said 5' ribosome
binding site consists of SEQ ID NO: 12, and said 3' polyadenylation is the bovine
growth hormone (BGH) polyadenylation signal.

18. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and a recombinant adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

19. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

20. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

21. The nucleic acid of claim 20, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

22. The nucleic acid of claim 21, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

23. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

24. The nucleic acid of claim 23, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

25. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

26. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2 or SEQ ID NO: 3.

27. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising an origin of replication, a selectable marker,
and:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

28. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

29. The nucleic acid of claim 28, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

30. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

31. The nucleic acid of claim 30, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

32. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of a nucleotide sequence substantially similar to SEQ ID
NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

33. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector having an adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

34. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

35. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

36. The nucleic acid of claim 35, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

37. The nucleic acid of claim 36, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

38. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

39. The nucleic acid of claim 37, wherein said promoter is the human
intermediate early cytomegalovirus promoter, said 5' ribosome binding site consists
of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation signal.

40. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2,
SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

41. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2 or
SEQ ID NO: 3.

42. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

43. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

44. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

45. An adenovector consisting of the nucleic acid sequence of SEQ
ID NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

46. An adenovector produced by a process comprising the steps of:

a) producing an adenovirus genome plasmid by homologous
recombination between the shuttle vector of claim 9 and a nucleic acid comprising:

a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region; and

b) rescuing said adenovector from said adenovirus plasmid.



47. A cultured recombinant cell comprising the nucleic acid of
claim 6.



48. A cultured recombinant cell comprising the nucleic acid of any 
one of claims 9-46. 

25 49. A method of making an adenovector comprising the steps of: 

a) producing an adenovirus genome plasmid comprising a gene 
expression cassette by homologous recombination between the nucleic acid of claim 9 
and a nucleic acid comprising; 

a first adenovirus region from about base pair 1 to about base 
30 pair 450 con-esponding to either Ad5 or Ad6; 

a second adenovirus region from about base pair 35 1 1 to about 
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to said first region; 
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a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to said second region; 

a fourth adenovirus region from about base pair 30818 to about 
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and 

a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region; and 
b) rescuing said recombinant adenovirus from said recombinant
adenovirus plasmid.
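The region limits recited in step a) determine which stretches of the parental genome are absent from the recombination partner. The following is a minimal sketch, assuming the "about base pair" limits are treated as exact, inclusive coordinates; the names `REGIONS` and `omitted_spans` are hypothetical and are not part of the claims.

```python
# Region limits (inclusive base pairs) as recited in step a), per serotype.
# Treating the "about" limits as exact values is an assumption of this sketch.
REGIONS = {
    "Ad5": [(1, 450), (3511, 5548), (5549, 28133), (30818, 33966), (33967, 35935)],
    "Ad6": [(1, 450), (3508, 5541), (5542, 28156), (30789, 33784), (33785, 35759)],
}


def omitted_spans(regions):
    """Return the inclusive spans lying between consecutive recited regions."""
    gaps = []
    for (_, prev_end), (next_start, _) in zip(regions, regions[1:]):
        if next_start > prev_end + 1:
            gaps.append((prev_end + 1, next_start - 1))
    return gaps


for serotype, regions in REGIONS.items():
    print(serotype, omitted_spans(regions))
```

Under these assumptions two spans are omitted per serotype: one between the first and second regions and one between the third and fourth regions.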

50. A pharmaceutical composition comprising the nucleic acid of 
any one of claims 13-17 and 32-46 and a pharmaceutically acceptable carrier.

51. A method of treating a patient comprising the step of
administering to said patient an effective amount of the nucleic acid of any one of 
claims 13-17 and 32-46. 

52. The method of claim 51, wherein said patient is a human.

53. The method of claim 52, wherein said patient is not infected
with HCV.

54. The method of claim 52, wherein said patient is infected with
HCV.

55. A recombinant nucleic acid comprising one or more Ad6
regions and a region not present in Ad6, wherein at least one Ad6 region is selected
from the group consisting of: E1A, E1B, E2B, E2A, E4, L1, L2, L4, and L5.

56. The recombinant nucleic acid of claim 55, wherein said region 
not present in Ad6 is an expression cassette coding for a polypeptide not found in
Ad6. 

57. The recombinant nucleic acid of claim 56, wherein said
recombinant nucleic acid is an adenovirus vector defective in at least E1 that is able to
replicate when E1 is supplied in trans.

58. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to said gene expression cassette; 

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) an optionally present fourth region from about base pair 28134 
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to 
about 30789 corresponding to Ad6, joined to said third region; 

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein said fifth region is joined to said fourth
region if said fourth region is present, or said fifth is joined to said third region if said
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region;

provided that at least one of said second, third, and fifth regions is
from Ad6.



59. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 
b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair 
5541 corresponding to Ad6, joined to said first region; 

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in a E3 parallel or E3 anti-parallel 
orientation joined to said third region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to said fourth region; 

provided that at least one of said second, third, and fourth regions is
from Ad6.
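The proviso closing the claim above can be expressed as a simple check over an ordered list of regions. This is an illustrative sketch only: the `Region` class, the `satisfies_proviso` function, and the example vector are hypothetical names and data, not constructs described in the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str      # "first" ... "fifth", or "cassette"
    serotype: str  # "Ad5", "Ad6", or "" for the gene expression cassette


def satisfies_proviso(vector, required=("second", "third", "fourth")):
    """True if at least one of the named regions is derived from Ad6."""
    return any(r.serotype == "Ad6" for r in vector if r.name in required)


# Hypothetical example: Ad5 backbone carrying an Ad6-derived third region.
vector = [
    Region("first", "Ad5"),
    Region("second", "Ad5"),
    Region("third", "Ad6"),
    Region("cassette", ""),
    Region("fourth", "Ad5"),
    Region("fifth", "Ad5"),
]
print(satisfies_proviso(vector))
```

For the proviso of claim 58, which names the second, third, and fifth regions, the same function could be called with `required=("second", "third", "fifth")`.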
EACKLTPPHS AKSKFGYGAK
TIMAKNEVFC VQPEKGGRKP 
MGSSYGFQYS PGQRVEFLVN 
ESIYQCCDLA PEARQAIKSL
1701 KLQDCTMLVN AAGLVVICES AGTQEDAASL RVFTEAMTRY SAPPGDPPQP

1751 EYDLELITSC SSNVSVAHDA SGKRVYYLTR DPTTPLARAA WETARHTPVN 

1801 SWLGNIIMYA PTLWARMILM THFFSILLAQ EQLEKALDCQ IYGACYSIEP 

1851 LDLPQIIERL HGLSAFSLHS YSPGEINRVA SCLRKLGVPP LRVWRHRARS

1901 VRARLLSQGG RAATCGKYLF NWAVKTKLKL TPIPAASQLD LSGWFVAGYS 

1951 GGDIYHSLSR ARPRWFMLCL LLLSVGVGIY LLPNR 
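The amino acid listing above can be cross-checked against a nucleotide listing by translating codons with the standard genetic code. The sketch below is a minimal illustration, assuming the standard code applies and that the reading frame starts at the first base supplied; the function name `translate` is hypothetical. The example codons (tac ctg ctc ccc aac cgg taa) are taken from the lower-case nucleotide listing in this document and correspond to the C-terminal residues YLLPNR followed by a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a nucleotide string; whitespace is ignored, '*' marks a stop."""
    dna = "".join(dna.split()).upper().replace("U", "T")
    # Step through complete codons only; trailing partial codons are dropped.
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


print(translate("tacctgctcc ccaaccggta a"))
```

A mismatch between a translated block and the printed amino acid block usually points to a character-recognition error in one of the two listings rather than to a sequence difference.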
1701 AGCCACGGTG TGCGCCAGGG CTCAGGCCCG ACCTCCATCA TGGGATCAAA

1751 TGTGGAAGTG TCTCATACGG CTGAAACCTA CGCTGCACGG GCCAACACCC 

1801 TTGCTGTACA GGCTGGGAGC CGTCCAAAAT GAGGTCACCC TCACCCACCC

1851 CATAACCAAA TACATCATGG CATGCATGTC GGCTGACCTG GAGGTCGTCA 

1901 CTAGCACCTG GGTGCTGGTG GGCGGAGTCC TTGCAGCTCT GGCCGCGTAT 

1951 TGCCTGACAA CAGGCAGTGT GGTCATTGTG GGTAGGATTA TCTTGTCCGG 

2001 GAGGCCGGCT ATTGTTCCCG ACAGGGAGTT TCTCTACCAG GAGTTCGATG 

2051 AAATGGAAGA GTGCGCCTGG CACCTCCCTT ACATCGAGCA GGGAATGCAG 

2101 CTCGCCGAGC AATTCAAGCA GAAAGCGCTC GGGTTACTGC AAACAGCCAC 

2151 CAAACAAGCG GAGGCTGCTG CTCCCGTGGT GGAGTCCAAG TGGCGAGCCC 

2201 TTGAGACATT CTGGGCGAAG CACATGTGGA ATTTCATCAG CGGGATACAG 

2251 TACTTAGCAG GCTTATCCAC TCTGCCTGGG AACCCCGCAA TAGCATCATT 

2301 GATGGCATTC ACAGCCTCTA TCACCAGCCC GCTCACCACC CAAAGTACCC 

2351 TCCTGTTTAA CATCTTGGGG GGGTGGGTGG CTGCCCAACT CGCCCCCCCC 
2401 AGCGCCGCTT CGGCTTTCGT GGGCGCCGGC ATCGCCGGTG CGGCTGTTGG

2451 CAGCATAGGC CTTGGGAAGG TGCTTGTGGA CATTCTGGCG GGTTATGGAG 

2501 CAGGAGTGGC CGGCGCGCTC GTGGCCTTCA AGGTCATGAG CGGCGAGATG 

2551 GCCTCCACCG AGGACCTGGT CAATCTACTT CCTGCCATCC TCTCTCCTGG 

2601 CGCCCTGGTC GTCGGGGTCG TGTGTGCAGC AATACTGCGT CGACACGTGG 

2651 GTCCGGGAGA GGGGGCTGTG CAGTGGATGA ACCGGCTGAT AGCGTTCGCC 

2701 TCGCGGGGTA ATCATGTTTC CCCCACGCAC TATGTGCCTG AGAGCGACGC 

2751 CGCAGCGCGT GTTACTCAGA TCCTCTCCAG CCTTACCATC ACTCAGCTGC 

2801 TGAAAAGGCT CCACCAGTGG ATTAATGAAG ACTGCTCCAC ACCGTGTTCC 

2851 GGCTCGTGGC TAAGGGATGT TTGGGACTGG ATATGCACGG TGTTGACTGA 

2901 CTTCAAGACC TGGCTCCAGT CGAAGCTCCT GCCGCAGCTA CCGGGAGTCC 

2951 CTTTTTTCTC GTGCCAACGC GGGTACAAGG GAGTCTGGCG GGGAGACGGC

3001 ATCATGCAAA CCACCTGCCC ATGTGGAGCA CAGATCACCG GACATGTCAA

3051 AAACGGTTCC ATGAGGATCG TCGGGCCTAA GACCTGCAGC AACACGTGGC 

3101 ATGGAACATT CCCCATCAAC GCATACACCA CGGGCCCCTG CACACCCTCT 

3151 CCAGCGCCAA ACTATTCTAG GGCGCTGTGG CGGGTGGCCG CTGAGGAGTA 

3201 CGTGGAGGTC ACGCGGGTGG GGGATTTCCA CTACGTGACG GGCATGACCA 

3251 CTGACAACGT AAAGTGCCCA TGCCAGGTTC CGGCTCCTGA ATTCTTCACG 

3301 GAGGTGGACG GAGTGCGGTT GCACAGGTAC GCTCCGGCGT GCAGGCCTCT 

3351 CCTACGGGAG GAGGTTACAT TCCAGGTCGG GCTCAACCAA TACCTGGTTG 
1 GCCACCATGG CCCCCATCAC CGCCTACAGC CAGCAGACCC GCGGCCTGCT

51 GGGCTGCATC ATCACCAGCC TGACCGGCCG CGACAAGAAC CAGGTGGAGG 

101 GCGAGGTGCA GGTGGTGAGC ACCGCCACCC AGAGCTTCCT GGCCACCTGC 

151 GTGAACGGCG TGTGCTGGAC CGTGTACCAC GGCGCCGGCA GCAAGACCCT 

201 GGCCGGCCCC AAGGGCCCCA TCACCCAGAT GTACACCAAC GTGGACCAGG 

251 ACCTGGTGGG CTGGCAGGCC CCCCGCGGCG CCCGCAGCCT GACCCCCTGC 

301 ACCTGCGGCA GCAGCGACCT GTACCTGGTG ACCCGCCACG CCGACGTGAT 

351 CCCGGTGCGC CGCCGCGGCG ACAGCCGCGG CAGGCTGCTG AGCCCCCGCC 

401 CCGTGAGCTA CCTGAAGGGC AGCAGCGGCG GCCGGCTGCT GTGCCCCAGC 

451 GGCGACGCCG TGGGCATCTT CCGCGGCGCC GTGTGCACGC GCGGGGTGGC 

501 CAAGGCCGTG GACTTCGTGG GCGTGGAGAG CATGGAGACC ACCATGCGCA 

551 GCCCCGTGTT CACCGACAAC AGCAGCCGCC CCGGCGTGCC CCAGAGCTTC 

601 CAGGTGGCCC ACCTGCACGG CCCCAGCGGC AGCGGCAAGA GCACCAAGGT

651 GCCCGCCGCG TACGCCGCCC AGGGCTACAA GGTGCTGGTG CTGAACCCCA 

701 GCGTGGCCGC CAGCCTGGGC TTCGGCGCCT ACATGAGCAA GGCCCACGGC 

751 ATCGACCCCA ACATCCGCAC CGGCGTGGGG ACCATCACCA CCGGCGCCCC 

801 CGTGACCTAC AGCACCTACG GCAAGTTCCT GGCCGACGGC GGCTGCAGCG 

851 GCGGCGCCTA CGACATCATC ATCTGCGACG AGTGCCACAG CACCGACAGC 

901 ACCACCATCC TGGGCATCGG CACCGTGCTG GACCAGGCCG AGACCGCCGG 

951 CGCCCGCCTG GTGGTGCTGG CCACCGCCAC CCCCCCCGGC AGCGTGACCG 

1001 TGCCCCACCC CAACATCGAG GAGGTGGCCC TGAGCAACAC CGGCGAGATC 

1051 CCCTTCTACG GCAAGGCCAT CCCCATCGAG GCCATCCGCG GCGGCCGCCA 

1101 CCTGATCTTC TGCCACAGCA AGAAGAAGTG CGACGAGCTG GCCGCCAAGC 

1151 TGAGCGGCCT GGGCATCAAC GCCGTGGCCT ACTACCGCGG CCTGGACGTG 

1201 AGCGTGATCC CCACCATCGG CGACGTGGTG GTGGTGGCCA CCGACGCCCT 

1251 GATGACCGGC TACACCGGCG ACTTCGACAG CGTGATCGAC TGCAACACCT 

1301 GCGTGACCCA GACCGTGGAC TTCAGCCTGG ACCCCACCTT CACCATCGAG 

1351 ACCACCACCG TGCCCCAGGA CGCCGTGAGC CGCAGCCAGC GCCGCGGCCG 

1401 GACCGGCCGC GGCCGCCGCG GCATCTACCG CTTCGTGACC CCCGGCGAGC 

1451 GCCCCAGCGG CATGTTCGAC AGCAGCGTGC TGTGCGAGTG GTACGACGCC 

1501 GGCTGCGCCT GGTACGAGCT GACCCCCGCC GAGACCAGCG TGCGCCTGCG 

1551 GGCCTACCTG AACACCCGCG GCCTGCCCGT GTGCCAGGAC CACCTGGAGT 

1601 TCTGGGAGAG GGTGTTCACG GGCCTGACCC ACATCGACGG CCACTTCCTG 

1651 AGCGAGAGCA AGCAGGCGGG CGACAACTTC CCCTACCTGG TGGCCTACCA 
1701 GGCCACCGTG TGCGCCCGCG CCCAGGCCCC CCCCCCCAGG TGGGACCAGA

1751 TGTGGAAGTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCCACCCCC 

1801 CTGCTGTACC GCCTGGGCGC CGTGCAGAAC GAGGTGACCC TGACCCACCC

1851 CATCACCAAG TACATCATGG CCTGCATGAG CGCCGACCTG GAGGTGGTGA 

1901 CCAGCACCTG GGTGCTGGTG GGCGGCGTGC TGGCCGCCCT GGCCGCCTAC 

1951 TGCCTGACCA CCGGCAGCGT GGTGATCGTG GGCCGCATCA TCCTGAGCGG 

2001 CCGCCCCGCC ATCGTGCCCG ACCGCGAGTT CCTGTACCAG GAGTTCGACG 

2051 AGATGGAGGA GTGCGCGAGC CACCTGCCCT ACATCGAGCA GGGCATGCAG

2101 CTGGCCGAGC AGTTCAAGCA GAAGGCCCTG GGCCTGCTGC AGACCGGCAC 

2151 CAAGCAGGCC GAGGCCGCCG CCCCCGTGGT GGAGAGCAAG TGGCGCGCCC 

2201 TGGAGACCTT CTGGGCCAAG CACATGTGGA ACTTCATCAG CGGCATCCAG 

2251 TACCTGGCCG GCCTGAGCAC CCTGCCCGGC AACCCCGCCA TCGCCAGCCT 

2301 GATGGCCTTC ACCGCCAGCA TCACCAGCCC CCTGACCACC CAGAGCACCC 

2351 TGCTGTTCAA CATCCTGGGC GGCTGGGTGG CCGCCCAGCT GGCCCCCCCC 

2401 AGCGCCGCCA GCGCCTTCGT GGGCGCCGGC ATCGCCGGCG CCGCCGTGGG 

2451 CAGCATCGGC CTGGGCAAGG TGCTGGTGGA CATCCTGGCC GGCTACGGCG 

2501 CCGGCGTGGC CGGCGCCCTG GTGGCCTTCA AGGTGATGAG CGGCGAGATG 

2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCCGCCATCC TGAGCCCCGG 

2601 CGCCCTGGTG GTGGGCGTGG TGTGCGCCGC CATCCTGCGC CGCCACGTGG 

2651 GCCCCGGCGA GGGCGCCGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC 

2701 AGCCGCGGCA ACCACGTGAG CCCCACCCAC TACGTGCCCG AGAGCGACGC 

2751 CGCCGCCCGC GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC 

2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC CCCCTGCAGC 

2851 GGCAGCTGGC TGCGCGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA 

2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAGCTG CCCGGCGTGC 

2951 CCTTCTTCAG CTGCCAGCGC GGCTACAAGG GCGTGTGGCG CGGCGACGGC 

3001 ATCATGCAGA CCACCTGCCC CTGGGGCGCC CAGATCACCG GCCACGTGAA

3051 GAACGGCAGC ATGCGCATCG TGGGCCCCAA GACCTGCAGC AACACCTGGC

3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGCCCCTG CACCCCCAGC

3151 CCCGCCCCCA ACTACAGCCG CGCCCTGTGG CGCGTGGCCG CCGAGGAGTA

3201 CGTGGAGGTG ACCCGCGTGG GCGACTTCCA CTACGTGACC GGCATGACCA 

3251 CCGACAACGT GAAGTGCCCC TGCCAGGTGC CCGCCCCCGA GTTCTTCACC 

3301 GAGGTGGACG GCGTGCGCCT GCACCGCTAC GCCCCCGCCT GCCGCCCCCT 

3351 GCTGCGCGAG GAGGTGACCT TCCAGGTGGG CCTGAACCAG TACCTGGTGG 
1 catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt
61 ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 
121 gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 
181 gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 
241 taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 
301 agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 
361 gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 
421 cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 
481 catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 
541 gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 
601 tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 
661 cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 
721 attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 
781 atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
841 atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 
901 tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 
961 actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
1021 aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 
1081 gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 
1141 cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 
1201 tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc
1261 accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc
1321 actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc
1381 gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt
1441 gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 
1501 gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 
1561 tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg
1621 cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 
1681 tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 
1741 tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 
1801 atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 
1861 gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 
1921 gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 
1981 ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 
2041 attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 
2101 tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 
2161 acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 
2221 gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 
2281 gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 
2341 atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 
2401 gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 
2461 gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 
2521 acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 
2581 agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 
2641 tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 
2701 ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 
2761 tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 
2821 acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 
2881 ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 
2941 tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 
3001 gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 
3061 ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 
3121 atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 
3181 ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 
3241 aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 
3361 gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag
3421 gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac
3481 atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac
3541 cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa
3601 agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 
3661 gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 
3721 gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 
3781 gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 
3841 gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 
3901 cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 
3961 cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 
4021 actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 
4081 aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 
4141 tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 
4201 ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 
4261 atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 
4321 aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 
4381 tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 
4441 gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 
4501 atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 
4561 gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 
4621 gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 
4681 gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 
4741 acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 
4801 cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 
4861 gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 
4921 gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 
4981 gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 
5041 atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 
5101 gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 
5161 ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 
5221 gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 
5281 gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 
5341 tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 
5401 tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 
5461 tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 
5521 ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 
5581 ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 
5641 gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 
5701 gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 
5761 aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 
5821 ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 
5881 tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 
5941 ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 
6001 gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 
6061 aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 
6121 tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 
6181 gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 
6241 ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 
6301 acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 
6361 gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 
6421 agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 
6481 tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 
6541 tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 
6601 cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 
6661 aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg
6721 atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 
6781 cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 
6841 ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 
6901 gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 
6961 agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 
7021 ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 
7081 gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 
7141 cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 
7201 tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 
7261 ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 
7321 ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 
7381 ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc
7441 ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 
7501 ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 
7561 gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 
7621 atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 
7681 gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 
7741 actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 
7801 tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 
7861 aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 
7921 cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 
7981 gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 
8041 tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 
8101 ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 
8161 atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 
8221 gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 
8281 ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 
8341 agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 
8401 gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 
8461 tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 
8521 gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 
8581 ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 
8641 aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 
8701 ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 
8761 ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg
8821 gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 
8881 gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 
8941 ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 
9001 ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 
9061 aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 
9121 agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct
9181 cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 
9241 gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 
9301 tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg
9361 tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 
9421 catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 
9481 cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 
9541 ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 
9601 tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 
9661 tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 
9721 cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 
9781 actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 
9841 aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 
9901 cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 
9961 ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat
10021 cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 
10081 ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 
10141 tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 
10201 gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 
10261 tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 
10321 gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 
10381 gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 
10441 gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 
10501 gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 
10561 cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 
10621 gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 
10681 cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 
10741 ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 
10801 cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 
10861 tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 
10921 ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 
10981 gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 
11041 acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 
11101 tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 
11161 actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 
11221 cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 
11281 actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 
11341 gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 
11401 cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 
11461 ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 
11521 gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 
11581 gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 
11641 aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 
11701 tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 
11761 cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 
11821 gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc
11881 ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 
11941 agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 
12001 gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 
12061 agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 
12121 caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 
12181 cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 
12241 ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 
12301 catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 
12361 gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 
12421 tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 
12481 gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 
12541 ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 
12601 agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 
12661 gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 
12721 gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 
12781 gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 
12841 ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 
12901 ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 
12961 gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 
13021 cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 
13081 gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 
13141 cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 
13201 ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 
13261 gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc
13321 ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 
13381 aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 
13441 ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 
13501 ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 
13561 ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 
13621 cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc 
13681 acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt
13741 tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 
13801 cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 
13861 ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac 
13921 cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc
13981 ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 
14041 catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 
14101 ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 
14161 gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 
14221 gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 
14281 ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 
14341 gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 
14401 gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 
14461 gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 
14521 ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 
14581 cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 
14641 gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 
14701 ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 
14761 tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 
14821 gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 
14881 gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 
14941 gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 
15001 ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 
15061 agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 
15121 atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 
15181 ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 
15241 acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 
15301 gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg 
15361 gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 
15421 cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 
15481 cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 
15541 gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 
15601 aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 
15661 gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 
15721 gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 
15781 cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 
15841 ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 
15901 ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 
15961 tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 
16021 cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 
16081 cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 
16141 ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 
16201 cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 
16261 gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 
16321 agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 
16381 tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 
16441 gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 
16501 atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac
16561 gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac
16621 cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc
16681 aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg
16741 cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca
16801 ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc
16861 ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg
16921 gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg
16981 ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt
17041 cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc
17101 caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca
17161 accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac
17221 agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc
17281 gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg
17341 tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc
17401 gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt
17461 ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata
17521 gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag
17581 gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc
17641 gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc
17701 agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg
17761 ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc
17821 ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc
17881 ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac
17941 gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca
18001 caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac
18061 tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc
18121 ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt
18181 ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg
18241 gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt
18301 cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta
18361 ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg
18421 ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa
18481 aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa
18541 ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga
18601 gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat
18661 gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag
18721 cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg
18781 tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc
18841 aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg
18901 gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc
18961 cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg
19021 gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc
19081 agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt
19141 tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc
19201 cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc
19261 cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac
19321 ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc
19381 atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt
19441 gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc
19501 ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt
19561 ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga
19621 gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc
19681 tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt
19741 gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt
19801 ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc
19861 cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa
19921 gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 
19981 caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga
20041 ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 
20101 cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 
20161 tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 
20221 cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 
20281 tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 
20341 tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct 
20401 gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 
20461 ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 
20521 aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 
20581 agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 
20641 tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 
20701 acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac 
20761 gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 
20821 ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 
20881 ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 
20941 actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 
21001 gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 
21061 tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 
21121 caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 
21181 tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 
21241 ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 
21301 ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 
21361 gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 
21421 tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 
21481 aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 
21541 cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 
21601 ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 
21661 cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 
21721 ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 
21781 tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 
21841 cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 
21901 catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 
21961 ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 
22021 catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 
22081 ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 
22141 ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 
22201 aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 
22261 agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 
22321 ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 
22381 agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 
22441 cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 
22501 cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc
22561 gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 
22621 gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 
22681 tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 
22741 tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 
22801 ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg
22861 acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 
22921 tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 
22981 accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 
23041 cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 
23101 cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca
23161 ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg
23221 aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 
23281 aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 
23341 gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 
23401 ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 
23461 aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 
23521 tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 
23581 tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 
23641 caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 
23701 taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 
23761 ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 
23821 acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 
23881 aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 
23941 agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 
24001 ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 
24061 caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 
24121 caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 
24181 aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 
24241 tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 
24301 accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 
24361 tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 
24421 acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 
24481 tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 
24541 ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 
24601 atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 
24661 ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 
24721 ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 
24781 ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 
24841 catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 
24901 ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 
24961 ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 
25021 ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 
25081 cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 
25141 tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca
25201 aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 
25261 atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 
25321 tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 
25381 acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 
25441 gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 
25501 tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 
25561 atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 
25621 tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 
25681 gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 
25741 tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 
25801 ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 
25861 cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 
25921 aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 
25981 ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 
26041 tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct
26101 attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 
26161 ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 
26221 gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 
26281 cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 
26341 gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 
26401 tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 



FIG. 4I



WO 03/031588 



PCT/US02/32512 



20/92 



26461 gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc
26521 acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc
26581 gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc
26641 ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg
26701 ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct
26761 ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat
26821 gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct
26881 gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 
26941 ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 
27001 ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 
27061 accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt
27121 cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 
27181 acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 
27241 cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 
27301 tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 
27361 ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 
27421 ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 
27481 ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 
27541 ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 
27601 cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 
27661 ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 
27721 tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 
27781 acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 
27841 gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 
27901 ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 
27961 gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 
28021 gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 
28081 acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 
28141 ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 
28201 acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 
28261 aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg
28321 gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 
28381 ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 
28441 ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 
28501 catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 
28561 ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 
28621 gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 
28681 tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 
28741 ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 
28801 gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 
28861 cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 
28921 gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg
28981 cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 
29041 agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 
29101 ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 
29161 gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 
29221 cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 
29281 agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 
29341 ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 
29401 agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 
29461 ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 
29521 tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 
29581 caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 
29641 gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 
29701 cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc
29761 gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa
29821 gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 
29881 cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 
29941 aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 
30001 agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 
30061 aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 
30121 gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 
30181 cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag
30241 gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 
30301 gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 
30361 tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt
30421 gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga
30481 tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag
30541 caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 
30601 ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 
30661 gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 
30721 ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 
30781 tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 
30841 ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 
30901 tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 
30961 gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 
31021 ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 
31081 cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 
31141 agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 
31201 ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaacaaacta 
31261 catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 
31321 aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 
31381 ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 
31441 agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggcttccg 
31501 tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 
31561 tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 
31621 gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ccctgcagac 
31681 ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 
31741 gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 
31801 tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 
31861 ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 
31921 cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 
31981 cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 
32041 gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 
32101 tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 
32161 attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 
32221 ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 
32281 aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 
32341 cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 
32401 gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 
32461 caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 
32521 ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 
32581 tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 
32641 acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 
32701 gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 
32761 aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 
32821 acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 
32881 cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 
32941 ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta
33001 acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca
33061 ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat
33121 gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 
33181 gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 
33241 tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 
33301 ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 
33361 atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 
33421 tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 
33481 caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 
33541 actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 
33601 tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 
33661 tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 
33721 actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 
33781 cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 
33841 tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 
33901 tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 
33961 caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc
34021 cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 
34081 attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 
34141 cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 
34201 ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 
34261 gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac
34321 gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 
34381 agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 
34441 ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 
34501 ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 
34561 gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 
34621 ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 
34681 acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 
34741 ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 
34801 gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 
34861 gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 
34921 acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 
34981 catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 
35041 tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 
35101 ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg
35161 agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc atgccaaatg gaacgccgga 
35221 cgtagtcata tttcctgaag caaaaccagg tgcgggcgtg acaaacagat ctgcgtctcc 
35281 ggtctcgccg cttagatcgc tctgtgtagt agttgtagta tatccactct ctcaaagcat 
35341 ccaggcgccc cctggcttcg ggttctatgt aaactccttc atgcgccgct gccctgataa 
35401 catccaccac cgcagaataa gccacaccca gccaacctac acattcgttc tgcgagtcac 
35461 acacgggagg agcgggaaga gctggaagaa ccatgttttt ttttttattc caaaagatta
35521 tccaaaacct caaaatgaag atctattaag tgaacgcgct cccctccggt ggcgtggtca
35581 aactctacag ccaaagaaca gataatggca tttgtaagat gttgcacaat ggcttccaaa 
35641 aggcaaacgg ccctcacgtc caagtggacg taaaggctaa acccttcagg gtgaatctcc 
35701 tctataaaca ttccagcacc ttcaaccatg cccaaataat tctcatctcg ccaccttctc 
35761 aatatatctc taagcaaatc ccgaatatta agtccggcca ttgtaaaaat ctgctccaga 
35821 gcgccctcca ccttcagcct caagcagcga atcatgattg caaaaattca ggttcctcac 
35881 agacctgtat aagattcaaa agcggaacat taacaaaaat accgcgatcc cgtaggtccc
35941 ttcgcagggc cagctgaaca taatcgtgca ggtctgcacg gaccagcgcg gccacttccc 
36001 cgccaggaac catgacaaaa gaacccacac tgattatgac acgcatactc ggagctatgc 
36061 taaccagcgt agccccgatg taagcttgtt gcatgggcgg cgatataaaa tgcaaggtgc
36121 tgctcaaaaa atcaggcaaa gcctcgcgca aaaaagaaag cacatcgtag tcatgctcat
36181 gcagataaag gcaggtaagc tccggaacca ccacagaaaa agacaccatt tttctctcaa
36241 acatgtctgc gggtttctgc ataaacacaa aataaaataa caaaaaaaca tttaaacatt
36301 agaagcctgt cttacaacag gaaaaacaac ccttataagc ataagacgga ctacggccat
36361 gccggcgtga ccgtaaaaaa actggtcacc gtgattaaaa agcaccaccg acagctcctc
36421 ggtcatgtcc ggagtcataa tgtaagactc ggtaaacaca tcaggttgat tcacatcggt
36481 cagtgctaaa aagcgaccga aatagcccgg gggaatacat acccgcaggc gtagagacaa
36541 cattacagcc cccataggag gtataacaaa attaatagga gagaaaaaca cataaacacc
36601 tgaaaaaccc tcctgcctag gcaaaatagc accctcccgc tccagaacaa catacagcgc
36661 ttccacagcg gcagccataa cagtcagcct taccagtaaa aaagaaaacc tattaaaaaa
36721 acaccactcg acacggcacc agctcaatca gtcacagtgt aaaaaagggc caagtgcaga
36781 gcgagtatat ataggactaa aaaatgacgt aacggttaaa gtccacaaaa aacacccaga
36841 aaaccgcacg cgaacctacg cccagaaacg aaagccaaaa aacccacaac ttcctcaaat
36901 cgtcacttcc gttttcccac gttacgtcac ttcccatttt aagaaaacta caattcccaa
36961 cacatacaag ttactccgcc ctaaaaccta cgtcacccgc cccgttccca cgccccgcgc
37021 cacgtcacaa actccacccc ctcattatca tattggcttc aatccaaaat aaggtatatt
37081 attgatgatg

10 30 50

ATGGCGCCCATCACGGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACT

. + - — . + + + + -+ 

MetAlaProIleThrAlaTyrSerGlnGlnThrArgGlyLeuLeuGlyCysIleIleThr

10 20 

70 90 110 

AGCCTTACAGGCCGGGACAAGAACCAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCA 

+ + + + + + 

SerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGlnValValSerThrAla 

30 40 

130 150 170 

ACACAATCCTTCCTGGCGACCTGCGTCAACGGCGTGTGTTGGACCGTTTACCATGGTGCT 

— __ — . + + + + + + 

ThrGlnSerPheLeuAlaThrCysValAsnGlyValCysTrpThrValTyrHisGlyAla 

50 60

190 210 230 

GGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGATGTACACTAATGTGGAC 

_: ;-- + + + + + -- + 

GlySerLysThrLeuAlaGlyProLysGlyProIleThrGlnMetTyrThrAsnValAsp

70 80 

250 270 290 

CAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGCACCTGT 

+ + + + + + 

GlnAspLeuValGlyTrpGlnAlaProProGlyAlaArgSerLeuThrProCysThrCys 

90 100 

310 330 350 

GGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGG 

+ H + + + + 

GlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIleProValArgArgArg

110 120 



370 390 410 

GGCGACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCG



GlyAspSerArgGlySerLeuLeuSerProArgProValSerTyrLeuLysGlySerSer 

130 140

430 450 470

GGTGGTCCACTGCTCTGCCCTTCG(^CACGCTC

GlyGlyProLeuLeuCysProSerGlyHisAlaValGlyIlePheArgAlaAlaValCys

150 160 

490 510 530 

ACCCGGGGGGTTGCGAAGGCGGTGGACTT^ 

+--- + + -____+ ._ + _- + 

ThrArgGlyValAlaLysAlaValAspPheValProValGluSerMetGluThrThrMet 

170 180 

550 570 590 

CGGTCTCCGGTCTTCACGGACAACTC 

+ + ---.-4- — + — + 

ArgSerProValPheThrAspAsnSerSerProProAlaValProGlnSerPheGlnVal

190 200 

610 630 650 

GCCCACCTACAC<^TCCCACTGGCAGCGGCAAGAGTACTAAAGT 

". ___+__ + + + + — + 

AlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysValProAlaAlaTyrAla

210 220 

670 690 710 

GCCCAAGGGTACAAGGTGCTCGTCCTCAATCCGTCCGTTGCCGCTACCTTAGGGTTTGGG 

.„ — + + + ^_^__ + ----- + - + 

AlaGlnGlyTyrLysValLeuValLeuAsnProSerValAlaAlaThrLeuGlyPheGly

230 240 

730 750 770

GCGTATATGTCTAAGGCAGACGGTATTCACCCCAACATCAGAACTGGGGTAAGGACCATT 

+ + - + : + - + + 

AlaTyrMetSerLysAlaHisGlyIleAspProAsnIleArgThrGlyValArgThrIle

250 260

790 810 830 

ACCACAGGCGCCCCCGTCACATACTGTAGGTATGGCAAGTTTCTTGCCGATCGTGGTTGG

Thi^hr^lyAlaProValTh^^

270 280

850 870 890

TCTGGGGGCGCTTATGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACA 

: _+__ _+ + +-- + + 

SerGlyGlyAlaTyrAspIleIleIleCysAspGluCysHisSerThrAspSerThrThr

290 300 

910 930 950 

ATCTTGGGCATCGGCACAGTCCTGGACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTG 



IleLeuGlyIleGlyThrValLeuAspGlnAlaGluThrAlaGlyAlaArgLeuValVal

310 320 

970 990 1010

CTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACACCCAAACATCGAGGAGGTG 

_ + + --_ + + H + 

LeuAlaThrAlaThrProProGlySerValThrValProHisProAsnIleGluGluVal

330 340 

1030 1050 1070 

GCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCATCCCCATTGAAGCCATC 

+ + . + + H + 

AlaLeuSerAsnThrGlyGluIleProPheTyrGlyLysAlaIleProIleGluAlaIle

350 360 

1090 1110 1130 

AGGGGGGGAAGGCATGTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTCGCCGCA 

+ + + + + + 

ArgGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCysAspGluLeuAlaAla

370 380 

1150 1170 1190 

AAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTC 

_„ + „ + - + + -+ + 

LysLeuSerGlyLeuGlyIleAsnAlaValAlaTyrTyrArgGlyLeuAspValSerVal

390 400 

1210 1230 1250 

ATACCAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACG 

; + + ■--+ r + + + 

IleProThrIleGlyAspValValValValAlaThrAspAlaLeuMetThrGlyTyrThr

410 420

1270 1290 1310

GGCGACTTTGACTCAGTGATCGACT^ 

, ■ _ + --^ — + + + 

GlyAspPheAspSerValIleAspCysAsnThrCysValThrGlnThrValAspPheSer

430 440 

1330 1350 1370 

TTGGATCCCACCTTCACCATTC 

+ + .__ + , ^____ + . . : + + 

LeuAspProThrPheThrIleGluThrThrThrValProGlnAspAlaValSerArgSer

450 460 

1390 1410 1430 

CAGCGGCGGGGTAGGACTGGCAC^GGTAGGAGA 

+ — + - + + — ■ --+ 

GlnArgArgGlyArgThrGlyArgGlyArgArgGlyIleTyrArgPheVa

470 480

1450 1470 1490 

GAACG<^CCTCGGGCATGTTCGAtTCCTC 

GluArgProSerGlyMetPheAspSerSerValL^ 

490 500 

1510 1530 1550 

GCTTGGTACGAGCTCACCCCCGCCGA 

AlaTrpTyrGluLeuThrProAlaGluThrSerValArgLeuArgAlaTyrLeuAsnThr

510 520 

1570 1590 1610 

CCAGGGTTGCCCGTTTGCGAGGACCACCTGGAG 

— i+- + ... — ;-^_ + _; 

ProGlyLeuProValCysGlnAspHisLe^^

530 540 

1630 1650 1670 

ACCCACATAGATGCACACTTCTTC 

ThrHisIleAspAlaHisPheLeuSerGlrm 

550 560

1690 1710 1730

CTGGTAGCATACCAAGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGAT 

. + + + + - + + 

LeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaProProProSerTrpAsp

570 580 

1750 1770 1790 

CAAATGTGGAAGTGTCTCATACGGCTGAAACCTA^^ 

+ + +. + + — + 

GlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeu

590 600 

1810 1830 1850 

TACAGGCTGGGAGCCGTCCAAAATGAGGTCACCCTCACCCACCCCATAACCAAATACATC

— -- + + + - + + + 

TyrArgLeuGlyAlaValGlnAsnGluValThrLeuThrHisProIleThrLysTyrIle

610 620 

1870 1890 1910 

ATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTGGGTGCTGGTGGGCGGA 

+ + + — + + 

MetAlaCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeuValGlyGly 

630 640 

1930 1950 1970 

GTCCTTGCAGCTCTGGCCGCGTATTGGCTGACAACAGGCAGTGTGGTCATTGTGGGTAGG 

+ + + + •*- + 

ValLeuAlaAlaLeuAlaAlaTyrCysLeuThrThrGlySerValValIleValGlyArg

650 660 

1990 2010 2030 

ATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGTTTCTCTACCAGGAGTTC 

+ + + + + ■ + 

IleIleLeuSerGlyArgProAlaIleValProAspArgGluPheLeuTyrGlnGluPhe

670 680 

2050 2070 2090 

GATGAAATGGAAGAGTGCGCCTCGCACCTCCCTTACATCGAGCAGGGAATGCAGCTCGCC 

+ + + + + . 

AspGluMetGluGluCysAlaSerHisLeuProTyrIleGluGlnGlyMetGlnLeuAla

690 700

2110 2130 2150

GAGCAATTCAAGCAGAAAGCGCTC

+ _^.« + + + : + + 

GluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaThrLysGlnAlaGluAla

710 720 

2170 2190 2210 

GCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATG 

_ ^ , , + + + + • + ^ + 

AlaAlaProValValGluSerLysTrpArgAlaLeuGluThrPheTrpAlaLysHisMet 

730 740 

2230 2250 2270 

TGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCT^^

+ + + + + . + 

TrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThrLeuProGlyAsnPro

750 760 

2290 2310 2330 

GCAATAGCATCATTGATGGCATTCACAGCCTCTA 

+ _ + + v : — + ---+ 

AlaIleAlaSerLeuMetAlaPheThrAlaSerIleThrSerProLeuThrThrGlnSer

770 780 

2350 2370 2390 

ACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCGCCCCCCAGCGCG 

ii.^, _ + - + + + - --: + 

ThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeuAlaProProSerAla

790 800 

2410 2430 2450 

GCTTCGGCTTTCGTGGGCGCCGGGATCGCGGGTGCGG 

+ .+ + . --+ + — -+ 

AlaSerAlaPheValGlyAlaGlyIleAlaGlyAlaAlaValGlySerIleGlyLeuGly

810 820 

2470 2490 2510 

AAGGTGCTTGTGGACATTCTGGCGK3GTTATGGAGCAGGAGTGGGCGGCGCGCTC 

_ + + -w + + - -- + -■ • + 

LysValLeuValAspIleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAla 

830 840 



FIG. 5F 



WO 03/031588 



PCT/US02/32512 



30/92 



2530 2550 2570 

TTCAAGGTCATGAGCGGCGAGATGCCCTCCACCGAGGACCTGGTCAATCTACTTCCTGCC 

PheLysValMetSerGlyGluMetProSerThrGluAspLeuValAsnLeuLeuProAla 

850 860 

2590 2610 2630 

ATCCTCTCTCCTGGCGCCCTGGTCGTCGGGGTCGTGTGTGCAGCAATACTGCGTCGACAC 

.| + ■ + + + + 

IleLeuSerProGlyAlaLeuValValGlyValValCysAlaAlaIleLeuArgArgHis

870 880 

2650 2670 2690 

GTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATAGCGTTCGCCTCGCGG 

_ + + _ + + + - + 

ValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArg 

890 900 

2710 2730 2750

GGTAATCATGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGCCGCAGCGCGTGTTACT 



- + 



GlyAsnHisValSerProThrHisTyrValProGluSerAspAlaAlaAlaArgValThr 

910 920 

2770 2790 2810 

CAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGGATTAAT 

, m + + + H + 

GlnIleLeuSerSerLeuThrIleThrGlnLeuLeuLysArgLeuHisGlnTrpIleAsn

930 940 

2830 2850 2870 

GAAGACTGCTCGACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGC 

; + + + + . + + 

GluAspCysSerThrProCysSerGlySerTrpLeuArgAspValTrpAspTrpIleCys 

950 960 

2890 2910 2930 

AC^TGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACCGGGA

; + + _. + + + + 

ThrValLeuThrAspPheLysThrTrpLeuGlnSerLysLeuLeuProGlnLeuProGly 

970 980 
2950 2970 2990

GTCCCTl^TTCTCGTGCCAAGGCGGGTACAAGGGAGTGTGGCGGGGAGACGGCATCATG 

. __ + + + + ;. + 

ValProPhePheSerCysGlnArgGlyTyrLysGlyValTrpArgGlyAspGlyIleMet

990 1000 

3010 3030 3050 

CAAACCACCTGCCCATGTGGAGCACAGATCACCGGACATGTCAAAAACGGTTCCATGAGG 

+ + + + 

GlnThrThrCysProCysGlyAlaGlnIleThrGlyHisValLysAsnGlySerMetArg

1010 1020 

3070 3090 3110 

ATCGTCGGGCCTAAGACCTCGAGCAACAGGTGGCATGGAACA 

+ ; + + T- + + 

IleValGlyProLysThrCysSerAsnTh^^ 

1030 1040 

3130 3150 3170 

ACCACGGGCCCCTGCACACCCTCTCCAGCGCCAAACTATTC 

+ . + + + + 

ThrThrGlyProCysThrProSerProAlaProAsnTyrSerArgAlaLeuTrpArgVal

1050 1060 

3190 3210 3230

GCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACGGGCATG

: + + _^-_ + + + - + 

AlaAlaGluGluTyrValGluValThrArgValGlyAspPheHisTyrValThrGlyMet

1070 1080 

3250 3270 3290 

ACCACTGACAACGTAAAGTGCGCATGCCAGGTTCCGGGTGCTGAATTCTTCACGGAGGTG 

.__ + . -+ + + ^-^ + 

ThrThrAspAsnValLysCysProCysGlnValProAlaProGluPhePheTh

1090 1100 

3310 3330 3350 

GAGGGAGTGCGGTtGGACAGGTAGGCTC 

+ _^-. . + _ + +- -+ + 

AspGlyValArgLeuHisArgTyrAlaProAlaCysArgProLeuLeuArgGluGluVal

1110 1120 
3370 3390 3410

ACATTCCAGGTCGGGCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGAGCCCGAA 



ThrPheGlnValGlyLeuAsnGlnTyrLeuValGlySerGlnLeuProCysGluProGlu

1130 1140 

3430 3450 3470 

CCGGATGTAGCAGTGCTCACTTCCATGCTCACCGACCCCTCCCACATCACAGCAGAAACG 

_+___ + - + -. + + - + 

ProAspValAlaValLeuThrSerMetLeuThrAspProSerHisIleThrAlaGluThr 

1150 1160 

3490 3510 3530 

GGTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGCTCTTCAGCTAGCCAG 



AlaLysArgArgLeuAlaArgGlySerProProSerLeuAlaSerSerSerAlaSerGln 

1170 1180 



3550 3570 3590 

TTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGTCTCTCCGGACGCTGAC 

+ + + -- + +- + 

LeuSerAlaProSerLeuLysAlaThrCysThrThrHisHisValSerProAspAlaAsp 

1190 1200 

3610 3630 3650 

CTCATCGAGGCCAACGTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGCGTGGAG 

+ + + + + + 

LeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsnIleThrArgValGlu

1210 1220 

3670 3690 3710 

TCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAG 

+ + + + + 

SerGluAsnLysValValValLeuAspSerPheAspProLeuArgAlaGluGluAspGlu

1230 1240 

3730 3750 3770 

AGGGAAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATG 

H + + + + + 

ArgGluValSerValProAlaGluIleLeuArgLysSerLysLysPheProAlaAlaMet 

1250 1260 



FIG. 5I






3790 3810 3830 

CCCATCTGGGCGCGCCCGGATTACAACCCTCCA^ 

; + .„ + + + + --- + 

ProIleTrpAlaArgProAspTyrAsnProProLeuL^ 

1270 1280 

3850 3870 3890 

TACGTCCCTGCGGTGGTGCACGG 

. + ; + + + + ■ -- + 

TyrValProProValValHisGlyCysProLeuProProIleLysAlaProProIlePro

1290 1300 

3910 3930 3950 



ProProArgArgLysArgThrValValLeuThrGluSerSerValSerSerAlaLeuAla

1310 1320 

3970 3990 4010 

GAGCTCGCTACTAAGACCTTCGGCAGCTCCGAATCATCGGCCGTCGACAGCGGCACGGCG 

^___ + _ + --_ + — + + •+ 

GluLeuAlaThrLysThrPheGlySerSerGluSerSerAlaValAspSerGlyThrAla 

1330 1340 

4030 4050 4070 

ACCGCGCTTCCTGACCAGGCCTCCGACGACGG 

ThrAlaLeuProAspGlnAlaSerAspAspGlyAspLysGlySerAspValGluSerTyr 

1350 1360 

4090 4110 4130 

TCCTCCATCCCCCCCGTTGAGGGGGAAGCGGGGGACCGCG^ 

SerSerMetProProLeuGluGlyGluProGlyAspProAspLeuSerAs

1370 1380 

4150 4170 4190 

TCTACCGTGAGCGAGGAAGC 

SerThrValSerGluGluAlaSerGluAspValValCysCysSerMetSerTyrThrT^ 

1390 1400 
4210 4230 4250

ACAGGCGCGTTGATCACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATCAACGCGTTG 

+ . + + . + + + 

ThrGlyAlaLeuIleThrProCysAlaAlaGluGluSerLysLeuProIleAsnAlaLeu

1410 1420 

4270 4290 4310 

AGCAACTCTTTGCTGCGCCACCATAACATGGTTTATGCCACAACATCTCGCAGCGCAGGC 

_ + . + + + + + 

SerAsnSerLeuLeuArgHisHisAsnMetValTyrAlaThrThrSerArgSerAlaGly

1430 1440 

4330 4350 4370 

CTGCGGCAGAAGAAGGTCACCTTTGACAGACTGCAAGTCCTGGACGACCACTACCGGGAC 

+ + + + + -+ 

LeuArgGlnLysLysValThrPheAspArgLeuGlnValLeuAspAspHisTyrArgAsp

1450 1460 

4390 4410 4430 

GTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCGTAGAG 

_ + __ + + + + - + 

ValLeuLysGluMetLysAlaLysAlaSerThrValLysAlaLysLeuLeuSerValGlu

1470 1480 

4450 4470 4490 

GAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGGGCAAAG 

+ + + + + + 

GluAlaCysLysLeuThrProProHisSerAlaLysSerLysPheGlyTyrGlyAlaLys

1490 1500 

4510 4530 4550 

GACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCGGTGTGGAAGGACTTG 

; + + + + + + 

AspValArgAsnLeuSerSerLysAlaValAsnHisIleHisSerValTrpLysAspLeu 

1510 1520 



4570 4590 4610 

CTGGAAGAGACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGT 



LeuGluAspThrValThrProIleAspThrThrIleMetAlaLysAsnGluValPheCys

1530 1540 
4630 4650 4670

GTCCAAGCAGAGAAAGGAGGCCGTAAGCGAGGCCGCCTTATCGTATTCCCAGATCTGGGA

: + + . + + + + 

ValGlnProGluLysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeuGly

1550 1560 

4690 4710 4730 

GTCCGTGTATGCGAGAAGATGGCCCTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTG 

+ + • + + — --- + — + 

ValArgValCysGluLysMetAlaLeuTyrAspValValSerThrLeuProGlnValVal 

1570 1580 

4750 4770 4790 

ATGGGCTGCTCATACGGATTCCAGTACTCTCCTGGGCAGCGAGTCGAGTTCGTGGTGAAT 

+ + + .— + + + 

MetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuValAsn

1590 1600 

4810 4830 4850 

ACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCATATGACAC 

+ + + + ^ + • -- + 

ThrTrpLysSerLysLysAsnProMe^ 

1610 1620



4870 4890 4910 

ACGGTGAGCGAGAACGAGATCCGTGTTGAGGAGTGAATTTACCAATGTTGTGACTTGGCC 



ThrValThrGluAsnAspIleArgValGluGluSerIleTyrGlnCysCysAspLeuAla

1630 1640 

4930 4950 4970 

CCGGAAGCCAGACAGGGGATAAAATGGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTG 

+ + : +- + + + 

ProGluAlaArgGlnAlaIleLysSerLeuThrGluArgLeuTyrIleGlyGlyProLeu

1650 1660 

4990 5010 5030 

ACTAATTCAAAAGGGCAGAACTCCGGTTATCGCCGGTGCCGGGGGAGCGGGGTGCTGAGG 

— - + + . — + +: + + . 

ThrAsnSerLysGlyGlnAsnCysGlyTyrArgArgCysArgAlaSerGlyValLeuThr

1670 1680 
5050 5070 5090

ACTAGCTGCGGTAACACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCG 

— _ + — +- — + — + + + 

ThrSerCysGlyAsnThrLeuThrCysTyrLeuLysAlaSerAlaAlaCysArgAlaAla 

1690 1700 

5110 5130 5150 

AAGCTCCAGGACTGCACGATGCTCGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGC 

+ + + + +- + 

LysLeuGlnAspCysThrMetLeuValAsnGlyAspAspLeuValValIleCysGluSer

1710 1720 

5170 5190 5210 

GCGGGAACCCAAGAGGACGCGGCGAGCCTACGAGTCTTCACGGAGGCTATGACTAGGTAC 

_ + + . + + + + 

AlaGlyThrGlnGluAspAlaAlaSerLeuArgValPheThrGluAlaMetThrArgTyr

1730 1740 

5230 5250 5270 

TCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGAGCTGATAACATCATGT 

+ __ + + + + + 

SerAlaProProGlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSerCys 

1750 1760 

5290 5310 5330 

TCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTCACCCGT 

+ + + + + + 

SerSerAsnValSerValAlaHisAspAlaSerGlyLysArgValTyrTyrLeuThrArg 

1770 1780 

5350 5370 5390 

GATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAAC 

„_ + i.. + + + + + 

AspProThrThrProLeuAlaArgAlaAlaTrpGluThrAlaArgHisThrProValAsn 

1790 1800 

5410 5430 5450 

TCCTGGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATG 

_ + + + ; + + + 

SerTrpLeuGlyAsnIleIleMetTyrAlaProThrLeuTrpAlaArgMetIleLeuMet

1810 1820 
5470 5490 5510

ACTCACTTCTTCTCCATCCTTCTAGGACAGGAGCAACTTGAAAAAGCCCTGGACTGCCAG 

^ ; + + _+-_-- - + 

ThrHisPhePheSerIleLeuLeuAlaGlnGluGlnLeuGluLysAlaLeuAspCysGln

1830 1840 

5530 5550 5570 

ATCTACGGGGCCTGTTACTCCATTGAGCCACTTGACCTACCTCAGATCATTGAACGACTC 

4 + + + + , + 

IleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuProGlnIleIleGluArgLeu

1850 1860

5590 5610 5630 

CATGC^CTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAGATCAATAGGGTGGCT 

; + + + + ;- + + 

HisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAla

1870 1880 

5650 5670 5690 

TCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCAGGAGC 

+ + — + + + 

SerCysLeuArgLysLeuGlyValProProLeuArgValTrpArgHisArgAlaArgSer

1890 1900 

5710 5730 5750

GTCCGCGCTAGGCTACTGTCCCAGGGK3GGGAGGGCCGCCACTTGTGGCAAGTACCTCTTC 

. + + + + + + 

ValArgAlaArgLeuLeuSerGlnGlyGlyArgAlaAlaThrCysGlyLysTyrLeuPhe 

1910 1920 

5770 5790 5810 

AACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGGTGCGTCCCAGCTGGAC 

„ + , +--- + • + + 

AsnTrpAlaValLysThrLysLeuLysLeuThrProIleProAlaAlaSerGlnLeuAsp 

1930 1940 

5830 5850 5870 

TTGTCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGT 

+_ -| + - + + + 

LeuSerGlyTrpPheValAlaGlyTyrSerGlyGlyAspIleTyrHisSerLeuSerArg

1950 1960 
5890 5910 5930

GCCCGACCCCGCTGGTTCATGCTGTGCCTAGTCCTACTTTCTGTAGGGGTAGGCATCTAC 



AlaArgProArgTrpPheMetLeuCysLeuLeuLeuLeuSerValGlyValGlyIleTyr

1970 1980 

5950 5955 
CTGCTCCCCAACCGA (SEQ. ID. NO. 5) 



LeuLeuProAsnArg (SEQ. ID. NO. 6) 
1985 
1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG
51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG
101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG
151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGCG GTGTGAAATA
201 CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG CTATTGGCCA
251 TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG
301 TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT
351 AATCAATTAC GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT
401 ACATAACTTA CGGTAAATGG CCCGCCTGGC TGACCGCCCA ACGACCCCCG
451 GCCATTGACG TCAATAATGA CGTATGTTCG CATAGTAACG CCAATAGGGA
501 CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC TGCCCACTTG
551 GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA
601 TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG AGCTTATGGG
651 ACTTTCCTAC TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG
701 GTGATGCGGT TTTGGCAGTA CATCAATGGG CGTGGATAGC GGTTTGACTG
751 ACGGGGATTT CCAAGTCTCC ACCCCATTGA CGTCAATGGG AGTTTGTTTT
801 GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA CTCCGCCCCA
851 TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG
901 AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT
951 TTGACCTCCA TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA
1001 CGGTGCATTG GAACGCGGAT TCCCCGTGCC AAGAGTGACG TAAGTACCGC
1051 CTATAGACTC TATAGGCACA CCCCTTTGGC TCTTATGCAT GCTATACTGT
1101 TTTTGGCTTG GGGCCTATAC ACCCCCGCTT CCTTATGCTA TAGGTGATGG
1151 TATAGCTTAG CCTATAGGTG TGGGTTATTG AGCATTATTG ACCACTCCCC
1201 TATTGGTGAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTTGCCA
1251 CAACTATCTC TATTGGCTAT ATGCCAATAC TCTGTCCTTC AGAGACTGAC
1301 ACGGACTCTG TATTTTTACA GGATGGGGTC CCATTTATTA TTTACAAATT
1351 CACATATACA ACAACGCCGT CCCCCGTGCC CGCAGTTTTT ATTAAACATA
1401 GCGTGGGATC TCCACGCGAA TCTCGGGTAC GTGTTCCGGA CATGGGCTCT
1451 TCTCCGGTAG CGGCGGAGCT TCCACATCCG AGCCCTGGTC CCATGCCTCC
1501 AGCGGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TGGAGGCCAG
1551 ACTTAGGCAC AGCACAATGC CCACCACCAC CAGTGTGCCG CACAAGGCCG
1601 TGGCGGTAGG GTATGTGTCT GAAAATGAGC GTGGAGATTG GGCTCGCACG
1651 GCTGACGCAG ATGGAAGACT TAAGGCAGCG GCAGAAGAAG ATGCAGGCAG
1701 CTGAGTTGTT GTATTCTGAT AAGAGTCAGA GGTAACTCCC GTTGCGGTGC
1751 TGTTAACGGT GGAGGGCAGT GTAGTCTGAG CAGTACTCGT TGCTGCCGCG
1801 CGCGCCACCA GACATAATAG CTGACAGACT AACAGACTGT TCCTTTCCAT
1851 GGGTCTTTTC TGCAGTCACC GTCCTTAGAT CTAGGTACCA GATATCAGAA
1901 TTCAGTCGAC AGCGGCCGCG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC
1951 TGTTGTTTGC CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC
2001 CCACTGTCCT TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT
2051 AGGTGTCATT CTATTCTGGG GGGTGGGGTG GGGCAGGACA GCAAGGGGGA
2101 GGATTGGGAA GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG
2151 CCGCTGCGGC CAGGTGCTGA AGAATTGACC CGGTTCCTCG TGGGCCAGAA 
2201 AGAAGCAGGC ACATCCCCTT CTCTGTGACA CACCCTGTCC ACGCCCCTGG 
2251 TTCTTAGTTC CAGCCCCACT CATAGGACAC TCATAGCTCA GGAGGGCTCC 
2301 GCCTTCAATC CCACCCGCTA AAGTACTTGG AGCGGTCTCT CCCTCCCTCA
2351 TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG AAATTAAAGC 
2401 AAGATAGGCT ATTAAGTGCA GAGGGAGAGA AAATGCCTCC AACATGTGAG 
2451 GAAGTAATGA GAGAAATCAT AGAATTTCTT CCGCTTCCTC GCTCACTGAC
2501 TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA 
2551 GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA 
2601 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG
2651 CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG 
2701 ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG 
2751 CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG 
2801 CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC 
2851 TCATAGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA 
2901 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA 
2951 TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC 
3001 ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG 
3051 GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGA 
3101 ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG 
3151 AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT 
3201 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA 
3251 GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC
3301 ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA 
3351 TCCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG 
3401 TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC 
3451 AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC GGGGGGGGGG 
3501 GGCGCTGAGG TCTGCCTCGT GAAGAAGGTG TTGCTGACTC ATACCAGGCC 
3551 TGAATCGCCC CATCATCCAG CCAGAAAGTG AGGGAGCCAC GGTTGATGAG 
3601 AGCTTTGTTG TAGGTGGACC AGTTGGTGAT TTTGAACTTT TGCTTTGCCA 
3651 CGGAACGGTC TGCGTTGTCG GGAAGATGCG TGATCTGATC CTTCAACTCA 
3701 GCAAAAGTTC GATTTATTCA ACAAAGCCGC CGTCCCGTCA AGTCAGCGTA 
3751 ATGCTCTGCC AGTGTTACAA CCAATTAACC AATTCTGATT AGAAAAACTC 
3801 ATCGAGCATC AAATGAAACT GCAATTTATT CATATCAGGA TTATCAATAC 
3851 CATATTTTTG AAAAAGCCGT TTCTGTAATG AAGGAGAAAA CTCACCGAGG 
3901 CAGTTCCATA GGATGGCAAG ATCCTGGTAT CGGTCTGCGA TTCCGACTCG 
3951 TCCAACATCA ATACAACCTA TTAATTTCCC CTCGTCAAAA ATAAGGTTAT 
4001 CAAGTGAGAA ATCACCATGA GTGACGACTG AATCCGGTGA GAATGGCAAA 
4051 AGCTTATGCA TTTCTTTCCA GACTTGTTCA ACAGGCCAGC CATTACGCTC
4101 GTCATCAAAA TCACTCGCAT CAACCAAACC GTTATTCATT CGTGATTGCG 
4151 CCTGAGCGAG ACGAAATACG CGATCGCTGT TAAAAGGACA ATTACAAACA 
4201 GGAATCGAAT GCAACCGGCG CAGGAACACT GCCAGCGCAT CAACAATATT
4251 TTCACCTGAA TCAGGATATT CTTCTAATAC CTGGAATGCT GTTTTCCCGG
4301 GGATCGCAGT GGTGAGTAAC CATGCATCAT CAGGAGTAGG GATAAAATGG
4351 TTGATGGTCG GAAGAGGCAT AAATTCCGTC AGCCAGTTTA GTCTGACCAT
4401 CTCATCTGTA ACATCATTGG CAACGGTACC TTTGCGATGT TTCAGAAACA
4451 ACTCTGGCGC ATCGGGCTTC CCATACAATC GATAGATTGT CGCACCTGAT
4501 TGCCCGACAT TATCGCGAGC GCATTTATAG CCATATAAAT CAGCATCCAT
4551 GTTGGAATTT AATCGCGGGG TCGAGCAAGA GGTTTCGCGT TGAATATGGG
4601 TCATAACACC CCTTGTATTA CTGTTTATGT AAGGAGAGAG TTTTATTGTT
4651 CATGATGATA TATTTTTATC TTGTGGAATG TAAGATCAGA GATTTTGAGA
4701 CACAAGGTGG CTTTCCCCCC CGCCCCATTA TTGAAGCATT TATCAGGGTT
4751 ATTGTCTCAT GAGCGGATAG ATATTTGAAT GTATTTAGAA AAATAAACAA
4801 ATAGGGGTTC GGCGCACATT TCCCCGAAAA GTGCGACCTG ACGTCTAAGA
4851 AACCATTATT ATGATGACAT TAACCTATAA AAATAGGCGT ATCAGGAGGC
4901 CCTTTCGTC













1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT 
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT 
121 GATGTTGTAA GTGTGGCGGA ACACATGTAA GCGCCGGATG TGGTAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACGG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 
241 TAAATTTGGG CGTAACCAAG TAATATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 
301 AGTGAAATCT GAATAATTCT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG CGCAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACGAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GAGTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAT TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA
901 CCTTGTGCCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAAATTATGG GCAGTGGGTG 
1141 ATAGAGTGGT GGGTTTGGTG TGGTAATTTT TTTTTTAATT TTTACAGTTT TGTGGTTTAA 
1201 AGAATTTTGT ATTGTGATTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGG CGTCCTAAAT TGGTGCCTGC TATCCTGAGA
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 TCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCCTCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT GGTGAGACAC
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCAA TAATACCGAC GGAGGAGCAA 
2161 CAGCAGGAGG AAGCCAGGCG GCGGCGGCGG CAGGAGCAGA GCCCATGGAA CCCGAGAGCC 
2221 GGCCTGGACC CTCGGGAATG AATGTTGTAC AGGTGGCTGA ACTGTTTCCA GAACTGAGAC 
2281 GCATTTTAAC CATTAACGAG GATGGGCAGG GGCTAAAGGG GGTAAAGAAG GAGCGGGGGG 
2341 CTTCTGAGGC TACAGAGGAG GCTAGGAATC TAACTTTTAG CTTAATGACC AGACACCGTC 
2401 CTGAGTGTGT TACTTTTCAG CAGATTAAGG ATAATTGCGC TAATGAGCTT GATCTGCTGG 
2461 CGCAGAAGTA TTCCATAGAG CAGCTGACCA CTTACTGGCT GCAGCCAGGG GATGATTTTG 
2521 AGGAGGCTAT TAGGGTATAT GCAAAGGTGG CACTTAGGCC AGATTGCAAG TACAAGATTA
2581 GCAAACTTGT AAATATCAGG AATTGTTGCT ACATTTCTGG GAACGGGGCC GAGGTGGAGA
2641 TAGATACGGA GGATAGGGTG GCCTTTAGAT GTAGCATGAT AAATATGTGG CCGGGGGTGC
2701 TTGGCATGGA CGGGGTGGTT ATTATGAATG TGAGGTTTAC TGGTCCCAAT TTTAGCGGTA
2761 CGGTTTTCCT GGCCAATACC AATCTTATCC TACACGGTGT AAGCTTCTAT GGGTTTAACA
2821 ATACCTGTGT GGAAGCCTGG ACCGATGTAA GGGTTCGGGG CTGTGCCTTT TACTGCTGCT
2881 GGAAGGGGGT GGTGTGTCGC GCCAAAAGCA GGGCTTCAAT TAAGAAATGC CTGTTTGAAA
2941 GGTGTACCTT GGGTATCCTG TCTGAGGGTA ACTCCAGGGT GCGCCACAAT GTGGCCTCCG
3001 ACTGTGGTTG CTTTATGCTA GTGAAAAGCG TGGCTGTGAT TAAGCATAAC ATGGTGTGTG
3061 GCAACTGCGA GGACAGGGCC TCTCAGATGC TGACCTGCTC GGACGGCAAG TGTCACTTGC
3121 TGAAGACCAT TCACGTAGCC AGCGACTCTC GCAAGGCCTG GCCAGTGTTT GAGCACAACA
3181 TACTGACCCG CTGTTCCTTG CATTTGGGTA ACAGGAGGGG GGTGTTCCTA CCTTACCAAT
3241 GCAATTTGAG TCACACTAAG ATATTGCTTG AGCCCGAGAG CATGTCCAAG GTGAACCTGA
3301 ACGGGGTGTT TGACATGACC ATGAAGATCT GGAAGGTGCT GAGGTACGAT GAGACCCGCA
3361 CCAGGTGCAG ACCCTGCGAG TGTGGCGGTA AACATATTAG GAACCAGCCT GTGATGCTGG
3421 ATGTGACCGA GGAGCTGAGG CCCGATCACT TGGTGCTGGC CTGCACCCGC GCTGAGTTTG
3481 GCTCTAGCGA TGAAGATACA GATTGAGGTA CTGAAATGTG TGGGCGTGGC TTAAGGGTGG
3541 GAAAGAATAT ATAAGGTGGG GGTCTCATGT AGTTTTGTAT CTGTTTTGCA GCAGCCGCCG
3601 CCATGAGCGC CAACTCGTTT GATGGAAGCA TTGTGAGCTC ATATTTGACA ACGCGCATGC
3661 CCCCATGGGC CGGGGTGCGT CAGAATGTGA TGGGCTCCAG CATTGATGGT CGCCCCGTCC
3721 TGCCCGCAAA CTCTACTACC TTGACCTACG AGACCGTGTC TGGAACGCCG TTGGAGACTG
3781 CAGCCTCCGC CGCCGCTTCA GCCGCTGCAG CCACCGCCCG CGGGATTGTG ACTGACTTTG
3841 CTTTCCTGAG CCCGCTTGCA AGCAGTGCAG CTTCCCGTTC ATCCGCCCGC GATGACAAGT
3901 TGACGGCTCT TTTGGCACAA TTGGATTCTT TGACCCGGGA ACTTAATGTC GTTTCTCAGC
3961 AGCTGTTGGA TCTGCGCCAG CAGGTTTCTG CCCTGAAGGC TTCCTCCCCT CCCAATGCGG
4021 TTTAAAACAT AAATAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG TGTCTTGCTG
4081 TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC GGTCGTTGAG
4141 GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA GATACATGGG
4201 CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT GCGGGGTGGT
4261 GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA TGTCTTTCAG
4321 TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC GGTTAAGCTG
4381 GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA GGTTGGCTAT
4441 GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA CAGTGTATCC
4501 GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA ACTTGGAGAC
4561 GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA TGGGCCCACG
4621 GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT GTTCCAGGAT
4681 GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT GCGGTATAAT
4741 GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC ACGCTTTGAG
4801 TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACCGTTT CCGGGGTAGG
4861 GGAGATCAGC TGGGAAGAAA GCAGGTTCCT AAGCAGCTGC GACTTACCGC AGCCGGTGGG
4921 CCCGTAAATG ACACCTATTA CCGGCTGCAA CTGGTAGTTA AGAGAGCTGC AGCTGCCGTC
4981 ATCCCTGAGC AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTTGCATGT TTTCCCTGAC
5041 CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG AAGCAAAGTT
5101 TTTCAACGGT TTGAGGCCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC CAAGCAGTTC 
5161 CAGGCGGTCC CACAGCTCGG TCACGTGCTC TACGGCATCT CGATCCAGCA TATCTCCTCG 
5221 TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC CAGACGGGCC 
5281 AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT CACGGTGAAG 
5341 GGGTGCGCTC CGGGTTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT GCTGGTGCTG 
5401 AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT GGTGTCATAG 
5461 TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA GGCGCCGCAC 
5521 GAGGGGCAGT GCAGACTTTT AAGGGCGTAG AGCTTGGGCG CGAGAAATAC CGATTCCGGG 
5581 GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG CCAGGTGAGC 
5641 TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG TTTCTTACCT 
5701 CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT GTCCCCGTAT 
5761 ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA TAGAAACTCG 
5821 GACCACTCTG AGACGAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA GTGGGAGGGG 
5881 TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA CATGTCGCCC 
5941 TCTTCGGCAT CAAGGAAGGT GATTGGTTTA TAGGTGTAGG CCACGTGACC GGGTGTTCCT
6001 GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC CGCATCGCTG 
6061 TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTCAA AAGCGGGCAT GACTTCTGCG 
6121 CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC CGCGGTGATG 
6181 CCTTTGAGGG TGGCCGCGTG CATCTGGTCA GAAAAGACAA TCTTTTTGTT GTCAAGCTTG 
6241 GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG CAGGGTTTGG 
6301 TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA TTCGCGCGCA 
6361 ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACTAGGTG CACGCGCCAA 
6421 CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG TAGGCGCTCG 
6481 TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGTGG GTCTAGCTGC 
6541 GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG CGCGTCGAAG
6601 TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC GGCAAGCGCG 
6661 CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC GGAGGCGTAC 
6721 ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA TGTAGGGTAG 
6781 CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA GGGAGCGAGG 
6841 AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT CTGCCTGAAG 
6901 ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT GGCGTCTGTG 
6961 AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT GACCAGCTCG 
7021 GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT GTCATACTTA 
7081 TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG GTCTTTCCAG 
7141 TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT GTAGAACTGG 
7201 TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC CTGCGCGGCC 
7261 TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTAA CCATGACTTT GAGGTACTGG 
7321 TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC CGTGCGCTTT
7381 TTGGAACGCG GGTTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT TCCCGCGCGA 
7441 GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT GTTAATTACC 
7501 TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT GTAAAGTTCC 
7561 AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA GGTGAGCTCT
7621 TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG GTTGGAAGCG 
7681 ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG AAAGGTCCTA 
7741 AACTGGCGAC CTATGGCGAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG CGGGTCTTGT 
7801 TCCCAGCGGT CCCATCCAAG GTCCGCGGCT AGGTCTCGCG CGGCGGTCAC TAGAGGCTCA 
7861 TCTCCGCCGA ACTTCATGAC CAGCATGAAG GGCACGAGCT GCTTCCCAAA GGCCCCCATC
7921 CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG ATGCGAGCCG 
7981 ATCGGGAAGA ACTGGATCTC CCGCCACCAG TTGGAGGAGT GGCTGTTGAT GTGGTGAAAG 
8041 TAGAAGTCCC TGCGACGGGC CGAACACTCG TCCTGGCTTT TGTAAAAACG TGCGCAGTAC 
8101 TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC GCGCACAAGG 
8161 AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC TTCTACTTCG 
8221 GCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG GACCACCACG 
8281 CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT GACAACATCG 
8341 CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGCG TCAGGTCAGG CGGGAGCTCC 
8401 TGCAGGTTTA CCTCGCATAG CCGGGTCAGG GCGCGGGCTA GGTCCAGGTG ATACCTGATT 
8461 TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC CCGCGGCGCG
8521 ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA TGCATCTAAA
8581 AGCGGTGACG CGGGCGGGCC CCCGGAGGTA GGGGGGGCTC GGGACCCGCC GGGAGAGGGG
8641 GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGG AGGTTGCTGG 
8701 CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG AAGACGACGG
8761 GCCCGGTGAG CTTGAACCTG AAAGAGAGTT CGACAGAATC AATTTCGGTG TCGTTGACGG
8821 CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG ATCTCGGCCA 
8881 TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC ACGGTGGCGG 
8941 CGAGGTCGTT GGAGATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT CCCTCGTTCC 
9001 AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC ACCTGCGCGA
9061 GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA AAGAGGTAGT
9121 TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGC CGCAACGTGG 
9181 ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG TCCACGGCGA 
9241 AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA AGACGGATGA 
9301 GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT TCTTCTTCTT
9361 CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT GGGGGAGGGG 
9421 GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG ATCATCTCCC 
9481 CGCGGCGACG GCGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG CGCAGTTGGA 
9541 AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCGTGC GGCAGGGATA 
9601 CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCACCG AGGGACCTGA 
9661 GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC CAGTCACAGT 
9721 CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG TTGTTTCTGG 
9781 CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG ATGGTCGACA 
9841 GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTCGGCC ATGCCCCAGG 
9901 CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT TCTACCGGCA 
9961 CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG GCGGCGGCGG 
10021 AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG CCCCTCATCG 
10081 GCTGAAGCAG GGCCAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC TGCACCTGCG
10141 TGAGGGTAGA CTGGAAGTCG TCCATGTCCA CAAAGCGGTG GTATGCGCCC GTGTTGATGG 
10201 TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTGACCCGGC TGCGAGAGCT 
10261 CGGTGTACCT GAGACGCGAG TAAGCCCTTG AGTCAAAGAC GTAGTCGTTG CAAGTCCGCA 
10321 CCAGGTACTG GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG GGCCAGCGTA 
10381 GGGTGGCCGG GGCTCCGGGG GCGAGGTCTT CCAACATAAG GCGATGATAT CCGTAGATGT 
10441 ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG TCACGGACGC 
10501 GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC TGGCCGGTCA 
10561 GGCGCGCGCA GTCGTTGACG CTCTAGACCG TGCAAAAGGA GAGCCTGTAA GCGGGCACTC
10621 TTCCGTGGTC TGGTGGATAA ATTCGCAAGG GTATCATGGC GGACGACCGG GGTTCGAACC 
10681 CCGGATCCGG CCGTCCGCCG TGATCCATGC GGTTACCGCC CGCGTGTCGA ACCCAGGTGT 
10741 GCGACGTCAG ACAACGGGGG AGCGCTCCTT TTGGCTTCCT TCCAGGCGCG GCGGATGCTG 
10801 CGCTAGCTTT TTTGGCCACT GGCCGCGCGC GGCGTAAGCG GTTAGGCTGG AAAGCGAAAG
10861 CATTAAGTGG CTCGCTCCCT GTAGCCGGAG GGTTATTTTC CAAGGGTTGA GTCGCGGGAC 
10921 CCCCGGTTCG AGTCTCGGGC CGGCCGGACT GCGGCGAACG GGGGTTTGCC TCCCCGTCAT 
10981 GCAAGACCCC GCTTGCAAAT TCCTCCGGAA ACAGGGACGA GCCCCTTTTT TGCTTTTCCC 
11041 AGATGCATCC GGTGCTGCGG CAGATGCGCC CCCCTCCTCA GCAGCGGCAA GAGCAAGAGC 
11101 AGCGGCAGAC ATGCAGGGCA CCCTCCCCTT CTCCTACCGC GTCAGGAGGG GCAACATCCG 
11161 CGGCTGACGC GGCGGCAGAT GGTGATTACG AACCCCCGCG GCGCCGGACC CGGCACTACT 
11221 TGGACTTGGA GGAGGGCGAG GGCCTGGCGC GGCTAGGAGC GCCCTCTCCT GAGCGACACC 
11281 CAAGGGTGCA GCTGAAGCGT GACACGCGCG AGGCGTACGT GCCGCGGCAG AACCTGTTTC 
11341 GCGACCGCGA GGGAGAGGAG CCCGAGGAGA TGCGGGATCG AAAGTTCCAT GCAGGGCGCG
11401 AGTTGCGGCA TGGCCTGAAC CGCGAGCGGT TGCTGCGCGA GGAGGACTTT GAGCCCGACG 
11461 CGCGGACCGG GATTAGTCCC GCGCGCGCAC ACGTGGCGGC CGCCGACCTG GTAACCGCGT 
11521 ACGAGCAGAC GGTGAACCAG GAGATTAACT TTCAAAAAAG CTTTAACAAC CACGTGCGCA 
11581 CGCTTGTGGC GCGCGAGGAG GTGGCTATAG GACTGATGCA TCTGTGGGAC TTTGTAAGCG 
11641 CGCTGGAGCA AAACCCAAAT AGCAAGCCGC TCATGGCGCA GCTGTTCCTT ATAGTGCAGC 
11701 ACAGCAGGGA CAACGAGGCA TTCAGGGATG CGCTGCTAAA CATAGTAGAG CCCGAGGGCC 
11761 GCTGGCTGCT CGATTTGATA AACATTCTGC AGAGCATAGT GGTGCAGGAG CGCAGCTTGA 
11821 GCCTGGCTGA CAAGGTGGCC GCCATTAACT ATTCCATGCT CAGTCTGGGC AAGTTTTACG 
11881 CCCGCAAGAT ATACCATACC CCTTACGTTC CCATAGACAA GGAGGTAAAG ATCGAGGGGT 
11941 TCTACATGCG CATGGCGCTG AAGGTGCTTA CCTTGAGCGA CGACCTGGGC GTTTATCGCA
12001 ACGAGCGCAT CCACAAGGCC GTGAGCGTGA GCCGGCGGCG CGAGCTCAGC GACCGCGAGC 
12061 TGATGCACAG CCTGCAAAGG GCCCTGGCTG GCACGGGCAG CGGCGATAGA GAGGCCGAGT 
12121 CCTACTTTGA CGCGGGCGCT GACCTGCGCT GGGCCCCAAG CCGACGCGCC CTGGAGGCAG 
12181 CTGGGGCCGG ACCTGGGCTG GCGGTGGCAC CCGCGCGCGC TGGCAACGTC GGCGGCGTGG 
12241 AGGAATATGA CGAGGACGAT GAGTACGAGC CAGAGGACGG CGAGTACTAA GCGGTGATGT 
12301 TTCTGATCAG ATGATGCAAG ACGCAACGGA CCCGGCGGTG CGGGCGGCGC TGCAGAGCCA 
12361 GCCGTCCGGC CTTAACTCCA CGGACGACTG GCGCCAGGTC ATGGACCGCA TCATGTCGCT
12421 GACTGCGCGC AACCCTGACG CGTTCCGGCA GCAGCCGCAG GCCAACCGGC TCTCCGCAAT 
12481 TCTGGAAGCG GTGGTCCCGG CGCGCGCAAA CCCCACGCAC GAGAAGGTGC TGGCGATCGT 
12541 AAACGCGCTG GCCGAAAACA GGGCCATCCG GCCCGATGAG GCCGGCCTGG TCTACGACGC
12601 GCTGCTTCAG CGCGTGGCTC GTTACAACAG CAGCAACGTG CAGACCAACC TGGACCGGCT
12661 GGTGGGGGAT GTGCGCGAGG CCGTGGCGCA GCGTGAGCGC GCGCAGCAGC AGGGCAACCT 
12721 GGGCTCCATG GTTGCACTAA ACGCCTTCCT GAGTACACAG CCCGCCAACG TGCCGCGGGG 
12781 ACAGGAGGAC TACACCAACT TTGTGAGCGC ACTGCGGCTA ATGGTGACTG AGACACCGCA 
12841 AAGTGAGGTG TATCAGTCCG GGCCAGACTA TTTTTTCCAG ACCAGTAGAC AAGGCCTGCA 
12901 GACCGTAAAC CTGAGCCAGG CTTTCAAGAA CTTGCAGGGG CTGTGGGGGG TGCGGGCTCC 
12961 CACAGGCGAC CGCGCGACCG TGTCTAGCTT GCTGACGCCC AACTCGCGCC TGTTGCTGCT 
13021 GCTAATAGCG CCCTTCACGG ACAGTGGCAG CGTGTCCCGG GACACATACC TAGGTCACTT 
13081 GCTGACACTG TACCGCGAGG CCATAGGTCA GGCGCATGTG GACGAGCATA CTTTCCAGGA 
13141 GATTACAAGT GTTAGCCGCG CGCTGGGGCA GGAGGACACG GGCAGCCTGG AGGCAACCCT 
13201 GAACTACCTG CTGACCAACC GGCGGCAAAA AATCCCCTCG TTGCACAGTT TAAACAGCGA 
13261 GGAGGAGCGC ATTTTGCGCT ATGTGCAGCA GAGCGTGAGC CTTAACCTGA TGCGCGACGG 
13321 GGTAACGCCC AGCGTGGCGC TGGACATGAC CGCGCGCAAC ATGGAACCGG GCATGTATGC 
13381 CTCAAACCGG CCGTTTATCA ATCGCCTAAT GGACTACTTG CATCGCGCGG CCGCCGTGAA
13441 CCCCGAGTAT TTCACCAATG CCATCTTGAA CCCGCACTGG CTACCGCCCC CTGGTTTCTA 
13501 CACCGGGGGA TTCGAGGTGC GCGAGGGTAA CGATGGATTC CTCTGGGACG ACATAGACGA 
13561 CAGCGTGTTT TCCCCGCAAC CGCAGACCCT GCTAGAGTTG CAACAACGCG AGCAGGCAGA 
13621 GGCGGCGCTG GGAAAGGAAA GCTTCCGCAG GCCAAGCAGC TTGTCCGATC TAGGCGCTGC 
13681 GGCCCCGCGG TCAGATGCTA GTAGCCCATT TCCAAGCTTG ATAGGGTCTC TTACCAGCAC
13741 TCGCACCACC CGCCCGCGCC TGCTGGGCGA GGAGGAGTAC CTAAACAACT CGCTGCTGCA
13801 GCCGCAGCGC GAAAAGAACC TGCCTCCGGC GTTTCCCAAC AACGGGATAG AGAGCCTAGT 
13861 GGACAAGATG AGTAGATGGA AGACGTATGC GCAGGAGCAC AGGGATGTGC CCGGCCCGCG 
13921 CCCGCCCACC CGTCGTCAAA GGCACGACCG TCAGCGGGGT CTGGTGTGGG AGGACGATGA 
13981 CTCGGCAGAC GACAGCAGCG TCTTGGATTT GGGAGGGAGT GGCAACCCGT TTGCACACCT 
14041 TCGCCCCAGG CTGGGGAGAA TGTTTTAAAA AAAGCATGAT GCAAAATAAA AAACTCACCA 
14101 AGGCCATGGC ACCGAGCGTT GGTTTTCTTG TATTGCCCTT AGTATGCGGC GCGCGGCGAT
14161 GTATGAGGAA GGTCCTCCTC CCTCCTACGA GAGCGTGGTG AGCGCGGCGC CAGTGGCGGC
14221 GGCGCTGGGT TCACCCTTCG ATGCTCCCCT GGACCCGCCG TTCGTGCCTC CGCGGTACCT 
14281 GCGGCCTACC GGGGGGAGAA ACAGCATCCG TTACTCTGAG TTGGCACCCC TATTCGACAC 
14341 CACCCGTGTG TACCTTGTGG ACAACAAGTC AACGGATGTG GCATCCCTGA ACTACCAGAA 
14401 CGACCACAGC AACTTTCTAA CCACGGTCAT TCAAAACAAT GACTACAGCC CGGGGGAGGC 
14461 AAGCACACAG ACCATCAATC TTGACGACCG GTCGCACTGG GGCGGCGACC TGAAAACCAT 
14521 CCTGCATACC AACATGCCAA ATGTGAACGA GTTCATGTTT ACCAATAAGT TTAAGGCGCG
14581 GGTGATGGTG TCGCGCTCGC TTACTAAGGA CAAACAGGTG GAGCTGAAAT ACGAGTGGGT 
14641 GGAGTTCACG CTGCCCGAGG GCAACTACTC CGAGACCATG ACCATAGACC TTATGAACAA 
14701 CGCGATCGTG GAGCACTACT TGAAAGTGGG CAGGCAGAAC GGGGTTCTGG AAAGCGACAT 
14761 CGGGGTAAAG TTTGACACCC GCAACTTCAG ACTGGGGTTT GACCCAGTCA CTGGTCTTGT 
14821 CATGCCTGGG GTATATACAA ACGAAGCCTT CCATCCAGAC ATCATTTTGC TGCCAGGATG 
14881 CGGGGTGGAC TTCACCCACA GCCGCCTGAG CAACTTGTTG GGCATCCGCA AGCGGCAACC 
14941 CTTCCAGGAG GGCTTTAGGA TCACCTACGA TGACCTGGAG GGTGGTAACA TTCCCGCACT 
15001 GTTGGATGTG GACGCCTACC AGGCAAGCTT GAAAGATGAC ACCGAACAGG GCGGGGGTGG
15061 CGCAGGCGGC GGCAACAACA GTGGCAGCGG CGCGGAAGAG AACTCCAACG CGGCAGCTGC
15121 GGCAATGCAG CCGGTGGAGG ACATGAACGA TCATGCCATT CGCGGCGACA CCTTTGCCAC
15181 ACGGGCGGAG GAGAAGCGCG CTGAGGCCGA GGCAGCGGCC GAAGCTGCCG CGGCCGCTGC 
15241 GGAGGCTGCA CAACCCGAGG TCGAGAAGCC TCAGAAGAAA CCGGTGATTA AACCCCTGAC 
15301 AGAGGACAGC AAGAAACGCA GTTACAACCT AATAAGCAAT GACAGCACCT TCACCCAGTA
15361 CCGCAGCTGG TACCTTGCAT ACAACTACGG CGACCCTCAG GCCGGGATCC GCTCATGGAC 
15421 CCTGCTTTGC ACTCCTGACG TAACCTGCGG CTCGGAGCAG GTATACTGGT CGTTGCCCGA 
15481 CATGATGCAA GACCCCGTGA CCTTCCGCTC CACGCGCCAG ATCAGCAACT TTCCGGTGGT 
15541 GGGCGCCGAG CTGTTGCCCG TGCACTCCAA GAGCTTCTAC AACGACCAGG CCGTCTACTC 
15601 CCAGCTCATC CGCCAGTTTA CCTCTCTGAC CCACGTGTTC AATCGCTTTC CCGAGAACCA 
15661 GATTTTGGCG CGCCCGCCAG CCCCCACCAT CACCACCGTC AGTGAAAACG TTCCTGCTCT 
15721 CACAGATCAC GGGACGCTAC CGCTGCGCAA CAGCATCGGA GGAGTCCAGC GAGTGACCAT 
15781 TACTGACGCC AGACGCCGCA CCTGCCCCTA CGTTTACAAG GCCCTGGGCA TAGTCTCGCC 
15841 GCGCGTCCTA TCGAGCCGCA CTTTTTGAGC AAGCATGTCC ATCCTTATAT CGCCCAGCAA 
15901 TAACACAGGC TGGGGCCTGC GCTTCCCAAG CAAGATGTTT GGCGGGGCCA AGAAGCGCTC 
15961 CGACCAACAC CCAGTGCGCG TGCGCGGGCA CTACCGCGCG CCCTGGGGCG CGCACAAACG 
16021 CGGCCGCACT GGGCGCACCA CCGTCGATGA CGCCATCGAC GCGGTGGTGG AGGAGGCGCG 
16081 CAACTACACG CCCACGCCGC CGCCAGTGTC CACCGTGGAC GCGGCCATTC AGACCGTGGT 
16141 GCGCGGAGCC CGGCGCTACG CTAAAATGAA GAGACGGCGG AGGCGCGTAG CACGTCGCCA 
16201 CCGCCGCCGA CCCGGCACTG CCGCCCAACG CGCGGCGGCG GCCCTGCTTA ACCGCGCACG 
16261 TCGCACCGGC CGACGGGCGG CCATGCGAGC CGCTCGAAGG CTGGCCGCGG GTATTGTCAC 
16321 TGTGCCCCCC AGGTCCAGGC GACGAGCGGC CGCCGCAGCA GCCGCGGCCA TTAGTGCTAT 
16381 GACTCAGGGT CGCAGGGGCA ACGTGTACTG GGTGCGCGAC TCGGTTAGCG GGCTGCGCGT 
16441 GCCCGTGCGC ACCCGCCCCC CGCGCAACTA GATTGCAATA AAAAACTACT TAGACTCGTA
16501 CTGTTGTATG TATCCAGCGG CGGCGGCGCG CATCGAAGCT ATGTCCAAGC GCAAAATCAA 
16561 AGAAGAGATG CTCCAGGTCA TCGCGCCGGA GATCTATGGC CCCCCGAAGA AGGAAGAGCA
16621 GGATTACAAG CCCCGAAAGC TAAAGCGGGT CAAAAAGAAA AAGAAAGATG ATGATGATGA 
16681 TGAACTTGAC GACGAGGTGG AACTGTTGCA CGCGACCGCG CCCAGGCGAC GGGTACAGTG
16741 GAAAGGTCGA CGCGTAAGAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TGACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGT CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAGC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCACCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACCA CCAGTAGCAC 
17221 TAGTATTGCC ACTGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCGGCGGT 
17281 GGCAGATGCC GCGGTGCAGG CGGCCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GTGTTTCAGC CCCCCGGCGT CCGCGCCGTT CAAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATCG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGCCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTACATGTG 
17941 GAAAAATCAA AATAAAAGTC TGGACTCTCA CGCTCGCTTG GTCCTGTAAC TATTTTGTAG 
18001 AATGGAAGAC ATCAACTTTG CGTCACTGGC CCCGCGACAC GGCTCGCGCC CGTTCATGGG 
18061 AAACTGGCAA GATATCGGCA CCAGCAATAT GAGCGGTGGC GCCTTCAGCT GGGGCTCGCT
18121 GTGGAGCGGC ATTAAAAATT TCGGTTCCGC CGTTAAGAAC TATGGCAGCA AAGCCTGGAA 
18181 CAGCAGCACA GGCCAGATGC TGAGGGACAA GTTGAAAGAG CAAAATTTCC AACAAAAGGT
18241 GGTAGATGGC CTGGCCTCTG GCATTAGGGG GGTGGTGGAC CTGGCCAACC AGGCAGTGCA 
18301 AAATAAGATT AACAGTAAGC TTGATCCCCG CCCTCCCGTA GAGGAGCCTC CACCGGCCGT 
18361 GGAGACAGTG TCTCCAGAGG GGCGTGGCGA AAAGCGTCCG CGACCCGACA GGGAAGAAAC 
18421 TCTGGTGACG CAAATAGACG AGCCTCCCTC GTACGAGGAG GCACTAAAGC AAGGCCTGCC 
18481 CACCACCCGT CCCATCGCGC CCATGGCTAC CGGAGTGCTG GGCCAGCACA CACCCGTAAC 
18541 GCTGGACCTG CCTCCCCCCG CCGACACCCA GCAGAAACCT GTGCTGCCAG GCCCGTCCGC 
18601 CGTTGTTGTA ACCCGTCCTA GCCGCGCGTC CCTGCGCCGC GCCGCCAGCG GTCCGCGATC 
18661 GTTGCGGCCC GTAGCCAGTG GCAACTGGCA AAGCACACTG AACAGCATCG TGGGTTTGGG
18721 GGTGCAATCC CTGAAGCGCC GACGATGCTT CTGATAGCTA ACGTGTCGTA TGTGTGTCAT 
18781 GTATGCGTCC ATGTCGCCGC CAGAGGAGCT GCTGAGCCGC CGCGCGCCCG CTTTCCAAGA 
18841 TGGCTACCCC TTCGATGATG CCGCAGTGGT CTTACATGCA CATCTCGGGC CAGGACGCCT 
18901 CGGAGTACCT GAGCCCCGGG CTGGTGCAGT TCGCCCGCGC CACCGAGACG TACTTCAGCC 
18961 TGAATAACAA GTTTAGAAAC CCCACGGTGG CGCCTACGCA CGACGTGACC ACAGACCGGT 
19021 CTCAGCGTTT GACGCTGCGG TTCATCCCCG TGGACCGCGA GGATACTGCG TACTCGTACA 
19081 AGGCGCGGTT CACCCTAGCT GTGGGTGATA ACCGTGTGCT AGACATGGCT TCCACGTACT 
19141 TTGACATCCG CGGCGTGCTG GACAGGGGCC CTACTTTTAA GCCCTACTCT GGCACTGCCT 
19201 ACAACGCACT GGCCCCCAAG GGTGCCCCCA ACTCGTGCGA GTGGGAACAA AATGAAACTG 
19261 CACAAGTGGA TGCTCAAGAA CTTGACGAAG AGGAGAATGA AGCCAATGAA GCTCAGGCGC 
19321 GAGAACAGGA ACAAGCTAAG AAAACCCATG TATATGCCCA GGCTCCACTG TCCGGAATAA
19381 AAATAACTAA AGAAGGTCTA CAAATAGGAA CTGCCGACGC CACAGTAGCA GGTGCCGGCA 
19441 AAGAAATTTT CGCAGACAAA ACTTTTCAAC CTGAACCACA AGTAGGAGAA TCTCAATGGA 
19501 ACGAAGCGGA TGCCACAGCA GCTGGTGGAA GGGTTCTTAA AAAGACAACT CCCATGAAAC 
19561 CCTGCTATGG CTCATACGCT AGACCCACCA ATTCCAACGG CGGACAGGGC GTTATGGTTG 
19621 AACAAAATGG TAAATTGGAA AGTCAAGTCG AAATGCAATT TTTTTCCACA TCCACAAATG
19681 CCACAAATGA AGTTAACAAT ATACAACCAA CAGTTGTATT GTACAGCGAA GATGTAAACA
19741 TGGAAACTCC AGATACTCAT CTTTCTTATA AACCTAAAAT GGGGGATAAA AATGCCAAAG 
19801 TCATGCTTGG ACAACAAGCA ATGCCAAACA GACCAAATTA CATTGCTTTT AGAGACAATT 
19861 TTATTGGTCT CATGTATTAC AACAGCACAG GTAACATGGG TGTCCTTGCT GGTCAGGCAT 
19921 CGCAGTTGAA CGCTGTTGTA GATTTGCAAG ACAGAAACAC AGAGCTGTCC TACCAGCTTT 
19981 TGCTTGATTC AATTGGCGAC AGAACAAGAT ACTTTTCAAT GTGGAATCAA GCTGTTGACA 
20041 GCTATGATCG AGATGTCAGA ATTATTGAGA ACCATGGAAC TGAGGATGAG TTGCCAAATT 
20101 ATTGCTTTCC TCTTGGTGGA ATTGGGATTA CTGACACTTT TCAAGCTGTT AAAACAACTG
20161 CTGCTAACGG GGACCAAGGC AATACTACCT GGCAAAAAGA TTCAACATTT GCAGAACGCA
20221 ATGAAATAGG GGTGGGAAAT AACTTTGCCA TGGAAATTAA CCTGAATGCC AACCTATGGA
20281 GAAATTTCCT TTACTCCAAT ATTGCGCTGT ACCTGCGAGA CAAGCTAAAA TACAACCCCA 
20341 CCAATGTGGA AATATCTGAC AACCCCAACA CCTACGACTA CATGAACAAG CGAGTGGTGG 
20401 CTCCTGGGCT TGTAGACTGC TACATTAACC TTGGGGCGCG CTGGTCTCTG GACTACATGG 
20461 ACAACGTTAA TCCCTTTAAC CACCACCGCA ATGCGGGCCT GCGTTACCGC TCCATGTTGT 
20521 TGGGAAACGG CCGCTACGTG CCCTTTCACA TTCAGGTGCC CCAAAAGTTT TTTGCCATTA 
20581 AAAACCTCCT CCTCCTGCCA GGCTCATACA CATATGAATG GAACTTCAGG AAGGATGTTA 
20641 ACATGGTTCT GCAGAGCTCT CTGGGAAACG ACCTTAGAGT TGACGGGGCT AGCATTAAGT 
20701 TTGACAGCAT TTGTCTTTAC GCCACCTTCT TCCCCATGGC CCACAACACG GCCTCCACGC 
20761 TGGAAGCCAT GCTCAGAAAT GACACCAACG ACCAGTCCTT TAATGACTAC CTTTCCGCCG
20821 CCAACATGCT ATATCCCATA CCCGCCAACG CCACCAACGT GCCCATCTCC ATCCCATCGC 
20881 GCAACTGGGC AGCATTTCGC GGTTGGGCCT TCACACGCTT GAAGACAAAG GAAACCCCTT 
20941 CCCTGGGATC AGGCTACGAC CCTTACTACA CCTACTCTGG CTCCATACCA TACCTTGACG 
21001 GAACCTTCTA TCTTAATCAC ACCTTTAAGA AGGTGGCCAT TACTTTTGAC TCTTCTGTTA 
21061 GCTGGCCGGG CAACGACCGC CTGCTTACTC CCAATGAGTT TGAGATTAAG CGCTCAGTTG 
21121 ACGGGGAGGG CTATAACGTA GCTCAGTGCA ACATGACAAA GGACTGGTTC CTAGTGCAGA 
21181 TGTTGGCCAA CTACAATATT GGCTACCAGG GCTTCTACAT TCCAGAAAGC TACAAAGACC 
21241 GCATGTACTC GTTCTTCAGA AACTTCCAGC CCATGAGCCG GCAAGTGGTG GACGATACTA 
21301 AATACAAAGA TTATCAGCAG GTTGGAATTA TCCACCAGCA TAACAACTCA GGCTTCGTAG 
21361 GCTACCTCGC TCCCACCATG CGCGAGGGAC AAGCTTACCC CGCTAATGTT CCCTACCCAC
21421 TAATAGGGAA AACCGCGGTT GATAGTATTA CCCAGAAAAA GTTTCTTTGC GACCGCACCC 
21481 TGTGGCGCAT CCCCTTCTCC AGTAACTTTA TGTCCATGGG TGCGCTCACA GACCTGGGCC 
21541 AAAACCTTCT CTACGCAAAC TCCGCCCACG CGCTAGACAT GACCTTTGAG GTGGATCCCA 
21601 TGGACGAGCC CACCCTTCTT TATGTTTTGT TTGAAGTCTT TGACGTGGTC CGTGTGCACC 
21661 AGCCGCACCG CGGCGTCATC GAGACCGTGT ACCTGCGCAC GCCCTTCTCG GCCGGCAACG 
21721 CCACAACATA AAGAAGCAAG CAACATCAAC AACAGCTGCC GCCATGGGCT CCAGTGAGCA 
21781 GGAACTGAAA GCCATTGTCA AAGATCTTGG TTGTGGGCCA TATTTTTTGG GCACCTATGA 
21841 CAAGCGCTTC CCAGGCTTTG TTTCCCCACA CAAGCTCGCG TGCGCCATAG TTAACACGGC 
21901 CGGTCGCGAG ACTGGGGGCG TACACTGGAT GGCCTTTGCC TGGAACCCGC GCTCAAAAAC 
21961 ATGCTACCTC TTTGAGCCCT TTGGCTTTTC TGACCAACGT CTCAAGCAGG TTTACCAGTT
22021 TGAGTACGAG TCACTCCTGC GCCGTAGCGC CATTGCCTCT TCCCCCGACC GCTGTATAAC
22081 GCTGGAAAAG TCCACCCAAA GCGTGCAGGG GCCCAACTCG GCCGCCTGTG GCCTATTCTG 
22141 CTGCATGTTT CTCCACGCCT TTGCCAACTG GCCCCAAACT CCCATGGATC ACAACCCCAC 
22201 CATGAACCTT ATTACCGGGG TACCCAACTC CATGCTTAAC AGTCCCCAGG TACAGCCCAC 
22261 CCTGCGCCGC AACCAGGAAC AGCTCTACAG CTTCCTGGAG CGCCACTCGC CCTACTTCCG
22321 CAGCCACAGT GCGCAAATTA GGAGCGCCAC TTCTTTTTGT CACTTGAAAA ACATGTAAAA 
22381 ATAATGTACT AGGAGACACT TTCAATAAAG GCAAATGTTT TTATTTGTAC ACTCTCGGGT
22441 GATTATTTAC CCCCACCCTT GCCGTCTGCG CCGTTTAAAA ATCAAAGGGG TTCTGCCGCG 
22501 CATCGCTATG CGCCACTGGC AGGGACACGT TGCGATACTG GTGTTTAGTG CTCCACTTAA
22561 ACTCAGGCAC AACCATCCGC GGCAGCTCGG TGAAGTTTTC ACTCCACAGG CTGCGCACCA 
22621 TCACCAACGC GTTTAGCAGG TCGGGCGCCG ATATCTTGAA GTCGCAGTTG GGGCCTCCGC
22681 CCTGCGCGCG CGAGTTGCGA TACACAGGGT TACAGCACTG GAACACTATC AGCGCCGGGT
22741 GGTGCACGCT GGCCAGCACG CTCTTGTCGG AGATCAGATC CGCGTCCAGG TCCTCCGCGT 
22801 TGCTCAGGGC GAACGGAGTC AACTTTGGTA GCTGCCTTCC CAAAAAGGGT GCATGCCCAG 
22861 GCTTTGAGTT GCACTCGCAC CGTAGTGGCA TCAGAAGGTG ACCGTGCCCA GTCTGGGCGT 
22921 TAGGATACAG CGCCTGCATG AAAGCCTTGA TCTGCTTAAA AGCCACCTGA GCCTTTGCGC 
22981 CTTCAGAGAA GAACATGCCG CAAGACTTGC CGGAAAACTG ATTGGCCGGA CAGGCCGCGT 
23041 CATGCACGCA GCACCTTGCG TCGGTGTTGG AGATCTGCAC CACATTTCGG CCCCACCGGT 
23101 TCTTCACGAT CTTGGCCTTG CTAGACTGCT CCTTCAGCGC GCGCTGCCCG TTTTCGCTCG 
23161 TCACATCCAT TTCAATCACG TGCTCCTTAT TTATCATAAT GCTCCCGTGT AGACACTTAA 
23221 GCTCGCCTTC GATCTCAGCG CAGCGGTGCA GCCACAACGC GCAGCCCGTG GGCTCGTGGT 
23281 GCTTGTAGGT TACCTCTGCA AACGACTGCA GGTACGCCTG CAGGAATCGC CCCATCATCG 
23341 TCACAAAGGT CTTGTTGCTG GTGAAGGTCA GCTGCAACCC GCGGTGCTCC TCGTTTAGCC 
23401 AGGTCTTGCA TACGGCCGCC AGAGCTTCCA CTTGGTCAGG CAGTAGCTTG AAGTTTGCCT 
23461 TTAGATCGTT ATCCACGTGG TACTTGTCCA TCAACGCGCG CGCAGCCTCC ATGCCCTTCT 
23521 CCCACGCAGA CACGATCGGC AGGCTCAGCG GGTTTATCAC CGTGCTTTCA CTTTCCGCTT 
23581 CACTGGACTC TTCCTTTTCC TCTTGCATCC GCATACCCCG CGCCACTGGG TCGTCTTCAT 
23641 TCAGCCGCCG CACCGTGCGC TTACCTCCCT TGCCGTGCTT GATTAGCACC GGTGGGTTGC 
23701 TGAAACCCAC CATTTGTAGC GCCACATCTT CTCTTTCTTC CTCGCTGTCC ACGATCACCT
23761 CTGGGGATGG CGGGCGCTCG GGCTTGGGAG AGGGGCGCTT CTTTTTCTTT TTGGACGCAA 
23821 TGGCCAAATC CGCCGTCGAG GTCGATGGCC GCGGGCTGGG TGTGCGCGGC ACCAGCGCAT 
23881 CTTGTGACGA GTCTTCTTCG TCCTCGGACT CGAGACGCCG CCTCAGCCGC TTTTTTGGGG
23941 GCGCGCGGGG AGGCGGCGGC GACGGCGACG GGGACGAGAC GTCCTCCATG GTTGGTGGAC 
24001 GTCGCGCCGC ACCGCGTCCG CGCTCGGGGG TGGTTTCGCG CTGCTCCTCT TCCCGACTGG 
24061 CCATTTCCTT CTCCTATAGG CAGAAAAAGA TCATGGAGTC AGTCGAGAAG GAGGACAGCC 
24121 TAACCGCCCC CTTTGAGTTC GCCACCACCG CCTCCACCGA TGCCGCCAAC GCGCCTACCA 
24181 CCTTCCCCGT CGAGGCACCC CCGCTTGAGG AGGAGGAAGT GATTATCGAG CAGGACCCAG 
24241 GTTTTGTAAG CGAAGACGAC GAAGATCGCT CAGTACCAAC AGAGGATAAA AAGCAAGACC 
24301 AGGACGACGC AGAGGCAAAC GAGGAACAAG TCGGGCGGGG GGACCAAAGG CATGGCGACT 
24361 ACCTAGATGT GGGAGACGAC GTGCTGTTGA AGCATCTGCA GCGCCAGTGC GCCATTATCT 
24421 GCGACGCGTT GCAAGAGCGC AGCGATGTGC CCCTCGCCAT AGCGGATGTC AGCCTTGCCT 
24481 ACGAACGCCA CCTGTTCTCA CCGCGCGTAC CCCCCAAACG CCAAGAAAAC GGCACATGCG 
24541 AGCCCAACCC GCGCCTCAAC TTCTACCCCG TATTTGCCGT GCCAGAGGTG CTTGCCACCT 
24601 ATCACATCTT TTTCCAAAAC TGCAAGATAC CCCTATCCTG CCGTGCCAAC CGCAGCCGAG 
24661 CGGACAAGCA GCTGGCCTTG CGGCAGGGCG CTGTCATACC TGATATCGCC TCGCTCGACG
24721 AAGTGCCAAA AATCTTTGAG GGTCTTGGAC GCGACGAGAA GCGCGCGGCA AACGCTCTGC 
24781 AACAAGAAAA CAGCGAAAAT GAAAGTCACT GTGGAGTGCT GGTGGAACTT GAGGGTGACA 
24841 ACGCGCGCCT AGCCGTGCTG AAACGCAGCA TCGAGGTCAC CCACTTTGCC TACCCGGCAC
24901 TTAACCTACC CCCCAAGGTT ATGAGCACAG TCATGAGCGA GCTGATCGTG CGCCGTGCAC 
24961 GACCCCTGGA GAGGGATGCA AACTTGCAAG AACAAACCGA GGAGGGCCTA CCCGCAGTTG 
25021 GCGATGAGCA GCTGGCGCGC TGGCTTGAGA CGCGCGAGCC TGCCGACTTG GAGGAGCGAC 
25081 GCAAGCTAAT GATGGCCGCA GTGCTTGTTA CCGTGGAGCT TGAGTGCATG CAGCGGTTCT 
25141 TTGCTGACCC GGAGATGCAG CGCAAGCTAG AGGAAACGTT GCACTACACC TTTCGCCAGG 



FIG. 7J




25201 GCTACGTGCG CCAGGCCTGC AAAATTTCCA ACGTGGAGCT CTGCAACCTG GTCTCCTACC
25261 TTGGAATTTT GCACGAAAAC CGCCTTGGGC AAAACGTGCT TCATTCCACG CTCAAGGGCG
25321 AGGCGCGCCG CGACTACGTC CGCGACTGCG TTTACTTATT TCTGTGCTAC ACCTGGCAAA
25381 CGGCCATGGG CGTGTGGCAG CAGTGCCTGG AGGAGCGCAA CCTGAAGGAG CTGCAGAAGC
25441 TGCTAAAGCA AAACTTGAAG GACCTATGGA CGGCCTTCAA CGAGCGCTCC GTGGCCGCGC
25501 ACCTGGCGGA CATTATCTTC CCCGAACGCC TGCTTAAAAC CCTGCAACAG GGTCTGCCAG
25561 ACTTCACCAG TCAAAGCATG TTGCAAAACT TTAGGAACTT TATCCTAGAG CGTTCAGGAA
25621 TTCTGCCCGC CACCTGCTGT GCGCTTCCTA GCGACTTTGT GCCCATTAAG TACCGTGAAT
25681 GCCCTCCGCC GCTTTGGGGT CACTGCTACC TTCTGCAGCT AGCCAACTAC CTTGCCTACC
25741 ACTCCGACAT CATGGAAGAC GTGAGCGGTG ACGGCCTACT GGAGTGTCAC TGTCGCTGCA
25801 ACCTATGCAC CCCGCACCGC TCCCTGGTCT GCAATTCACA ACTGCTTAGC GAAAGTCAAA
25861 TTATCGGTAC CTTTGAGCTG CAGGGTCCCT CGCCTGACGA AAAGTCCGCG GCTCCGGGGT
25921 TGAAACTCAC TCCGGGGCTG TGGACGTCGG CTTACCTTCG CAAATTTGTA CCTGAGGACT
25981 ACCACGCCCA CGAGATTAGG TTCTACGAAG ACCAATCCCG CCCGCCAAAT GCGGAGCTTA
26041 CCGCCTGCGT CATTACCCAG GGCCACATCC TTGGCCAATT GCAAGCCATT AACAAAGCCC
26101 GCCAAGAGTT TCTGCTACGA AAGGGACGGG GGGTTTACTT GGACCCCCAG TCCGGCGAGG
26161 AGCTCAACCC AATCCCCCCG CCGCCGCAGC CCTATCAGCA GCCGCGGGCC CTTGCTTCCC
26221 AGGATGGCAC CCAAAAAGAA GCTGCAGCTG CCGCCGCCGC CACCCACGGA CGAGGAGGAA
26281 TACTGGGACA GTCAGGCAGA GGAGGTTTTG GACGAGGAGG AGGAGATGAT GGAAGACTGG
26341 GACAGCCTAG ACGAGGAAGC TTCCGAGGCC GAAGAGGTGT CAGACGAAAC ACCGTCACCC
26401 TCGGTCGCAT TCCCCTCGCC GGCGCCCCAG AAATCGGCAA CCGTTCCCAG CATTGCTACA
26461 ACCTCCGCTC CTCAGGCGCC GCCGGCACTG CCCGTTCGCC GACCCAACCG TAGATGGGAC
26521 ACCACTGGAA CCAGGGCCGG TAAGTCTAAG CAGCCGCCGC CGTTAGCCCA AGAGCAACAA
26581 CAGCGCCAAG GCTACCGCTC GTGGCGCGTG CACAAGAACG CCATAGTTGC TTGCTTGCAA
26641 GACTGTGGGG GCAACATCTC CTTCGCCCGC CGCTTTCTTC TCTACCATCA CGGCGTGGCC
26701 TTCCCCCGTA ACATCCTGCA TTACTACCGT CATCTCTACA GCCCCTACTG CACCGGCGGC
26761 AGCGGCAGCA ACAGCAGCGG CCACGCAGAA GCAAAGGCGA CCGGATAGCA AGACTCTGAC
26821 AAAGCCCAAG AAATCCACAG CGGCGGCAGC AGCAGGAGGA GGAGCACTGC GTCTGGCGCC
26881 CAACGAACCC GTATCGACCC GCGAGCTTAG AAACAGGATT TTTCCCACTC TGTATGCTAT
26941 ATTTCAACAG AGCAGGGGCC AAGAACAAGA GCTGAAAATA AAAAACAGGT CTCTGCGCTC
27001 CCTCACCCGC AGCTGCCTGT ATCACAAAAG CGAAGATCAG CTTCGGCGCA CGCTGGAAGA
27061 CGCGGAGGCT CTCTTCAGCA AATACTGCGC GCTGACTCTT AAGGACTAGT TTCGCGCCCT
27121 TTCTCAAATT TAAGCGCGAA AACTACGTCA TCTCCAGCGG CCACACCCGG CGCCAGCACC
27181 TGTCGTCAGC GCCATTATGA GCAAGGAAAT TCCCACGCCC TACATGTGGA GTTACCAGCC
27241 ACAAATGGGA CTTGCGGCTG GAGCTGCCCA AGACTACTCA ACCCGAATAA ACTACATGAG
27301 CGCGGGACCC CACATGATAT CCCGGGTCAA CGGAATCCGC GCCCACCGAA ACCGAATTCT
27361 CCTCGAACAG GCGGCTATTA CCACCACACC TCGTAATAAC CTTAATCCCC GTAGTTGGCC
27421 CGCTGCCCTG GTGTACCAGG AAAGTCCCGC TCCCACCACT GTGGTACTTC CCAGAGACGC
27481 CCAGGCCGAA GTTCAGATGA CTAACTCAGG GGCGCAGCTT GCGGGCGGCT TTCGTCACAG
27541 GGTGCGGTCG CCCGGGCAGG GTATAACTCA CCTGAAAATC AGAGGGCGAG GTATTCAGCT
27601 CAACGACGAG TCGGTGAGCT CCTCTCTTGG TCTCCGTCCG GACGGGACAT TTCAGATCGG
27661 CGGCGCTGGC CGCTCTTCAT TTACGCCCCG TCAGGCGATC CTAACTCTGC AGACCTCGTC
27721 CTCGGAGCCG CGCTCCGGAG GCATTGGAAC TCTACAATTT ATTGAGGAGT TCGTGCCTTC
27781 GGTTTACTTC AACCCCTTTT CTGGACCTCC CGGCCACTAC CCGGACCAGT TTATTCCCAA 
27841 CTTTGACGCG GTAAAAGACT CGGCGGACGG CTACGACTGA ATGACCAGTG GAGAGGCAGA 
27901 GCAACTGCGC CTGACACACC TCGACCACTG GCGCCGCCAG AAGTGCTTTG CCCGCGGCTC
27961 CGGTGAGTTT TGTTACTTTG AATTGCCCGA AGAGCATATC GAGGGCCCGG CGCACGGCGT 
28021 CCGGCTCACC ACCCAGGTAG AGCTTACACG TAGCCTGATT CGGGAGTTTA CCAAGCGCCC 
28081 CCTGCTAGTG GAGCGGGAGC GGGGTCCCTG TGTTCTGACC GTGGTTTGCA ACTGTCCTAA 
28141 CCCTGGATTA CATCAAGATC TTTGTTGTCA TCTCTGTGCT GAGTATAATA AATACAGAAA 
28201 TTAGAATCTA CTGGGGCTCC TGTCGCCATC CTGTGAACGC CACCGTTTTT ACCCACCCAA 
28261 AGCAGACCAA AGCAAACCTC ACCTCCGGTT TGCACAAGCG GGCCAATAAG TACCTTACCT
28321 GGTACTTTAA CGGCTCTTCA TTTGTAATTT ACAACAGTTT CCAGCGAGAC GAAGTAAGTT 
28381 TGCCACACAA CCTTCTCGGC TTCAACTACA CCGTCAAGAA AAACACCACC ACCACCCTCC 
28441 TCACCTGCCG GGAACGTACG AGTGCGTCAC CGGTTGCTGC GCCCACACCT ACAGCCTGAG 
28501 CGTAACCAGA CATTACTCCC ATTTTCCCAA AACAGGAGGT GAGCTCAACT CCCGGAACTC
28561 AGGTCAAAAA AGCATTTTGC GGGGTGCTGG GATTTTTTAA TTAAGTATAT GAGCAATTCA 
28621 AGTAACTCTA CAAGCTTGTC TAATTTTTCT GGAATTGGGG TCGGGGTTAT CCTTACTCTT 
28681 GTAATTCTGT TTATTCTTAT ACTAGCACTT CTGTGCCTTA GGGTTGCCGC CTGCTGCACG 
28741 CACGTTTGTA CCTATTGTCA GCTTTTTAAA CGCTGGGGGC GACATCCAAG ATGAGGTACA 
28801 TGATTTTAGG CTTGCTCGCC CTTGCGGCAG TCTGCAGCGC TGCCAAAAAG GTTGAGTTTA 
28861 AGGAACCAGC TTGCAATGTT ACATTTAAAT CAGAAGCTAA TGAATGCACT ACTCTTATAA
28921 AATGCACCAC AGAACATGAA AAGCTTATTA TTCGCCACAA AGACAAAATT GGCAAGTATG
28981 CTGTATATGC TATTTGGCAG CCAGGTGACA CTAACGACTA TAATGTCACA GTCTTCCAAG 
29041 GTGAAAATCG TAAAACTTTT ATGTATAAAT TTCCATTTTA TGAAATGTGC GATATTACCA 
29101 TGTACATGAG CAAACAGTAC AAGTTGTGGC CCCCACAAAA GTGTTTAGAG AACACTGGCA 
29161 CCTTTTGTTC CACCGCTCTG CTTATTACAG CGCTTGCTTT GGTATGTACC TTACTTTATC 
29221 TCAAATACAA AAGCAGACGC AGTTTTATTG ATGAAAAGAA AATGCCTTGA TTTTCCGCTT 
29281 GCTTGTATTC CCCTGGACAA TTTACTCTAT GTGGGATATG CGCCAGGCGG GAAAGATTAT 
29341 ACCCACAACC TTCAAATCAA ACTTTCCTGG ACGTTAGCGC CTGACTTCTG CCAGCGCCTG 
29401 CACTGCAAAT TTGATCAAAC CCAGCTTCAG CTTGCCTGCT CCAGAGATGA CCGGCTCAAC
29461 CATCGCGCCC ACAACGGACT ATCGCAACAC CACTGCTACC GGACTAAAAT CTGCCCTAAA 
29521 TTTACCCCAA GTTCATGCCT TTGTCAATGA CTGGGCGAGC TTGGGCATGT GGTGGTTTTC 
29581 CATAGCGCTT ATGTTTGTTT GCCTTATTAT TATGTGGCTT ATTTGTTGCC TAAAGCGCAG 
29641 ACGCGCCAGA CCCCCCATCT ATAGGCCTAT CATTGTGCTC AACCCACACA ATGAAAAAAT 
29701 TCATAGATTG GACGGTCTCA AACCATGTTC TCTTCTTTTA CAGTATGATT AAATGAGACA 
29761 TGATTCCTCG AGTCCTTATA TTATTGACCC TTGTTGCGCT TTTCTGTGCG TGCTCTACAT
29821 TGGCTGCGGT CGCTCACATC GAAGTAGATT GCATCCCACC TTTCACAGTT TACCTGCTTT 
29881 ACGGATTTGT CACCCTTATC CTCATCTGCA GCCTCGTCAC TGTAGTCATC GCCTTCATTC 
29941 AGTTCATTGA CTGGATTTGT GTGCGCATTG CGTACCTTAG GCACCATCCG CAATACAGAG 
30001 ACAGGACTAT AGCTGATCTT CTCAGAATTC TTTAATTATG AAACGGATTG TCACTTTTGT 
30061 TTTGCTGATT TTCTGCGCCC TACCTGTGCT TTGCTCCCAA ACCTCAGCGC CTCCCAAAAG 
30121 ACATATTTCC TGCAGATTCA CTCAAATATG GAACATTCCC AGCTGCTACA ACAAACAGAG 
30181 CGATTTGTCA GAAGCCTGGT TATACGCCAT CATCTCTGTC ATGGTTTTTT GCAGTACCAT
30241 TTTTGCCCTA GCCATATACC CATACCTTGA CATTGGTTGG AATGCCATAG ATGCCATGAA
30301 CCACCCTACT TTCCCAGCGC CCAATGTCAT ACCACTGCAA CAGGTTATTG CCCCAATCAA 
30361 TCAGCCTCGC CCCCCTTCTC CCACCCCCAC TGAGATTAGC TACTTTAATT TGACAGGTGG 
30421 AGATGACTGA ATCTCTAGAT CTAGAATTGG ATGGAATTAA CACCGAACAG CGCCTACTAG 
30481 AAAGGCGCAA GGCGGCGTCC GAGCGAGAAC GCCTAAAACA AGAAGTTGAA GACATGGTTA 
30541 ACCTGCACCA GTGTAAAAGA GGTATCTTTT GTGTGGTCAA GCAGGCCAAA CTTACCTACG 
30601 AAAAAACCAC TACCGGCAAC CGCCTTAGCT ACAAGCTACC CACCCAGCGC CAAAAACTGG 
30661 TGCTTATGGT GGGAGAAAAA CCTATCACCG TCACCCAGCA CTCGGCAGAA ACAGAAGGCT 
30721 GCCTGCACTT CCCCTATCAG GGTCCAGAGG ACCTCTGCAC TCTTATTAAA ACCATGTGTG 
30781 GCATTAGAGA TCTTATTCCA TTCAACTAAC AATAAACACA CAATAAATTA CTTACTTAAA
30841 ATCAGTCAGC AAATCTTTGT CCAGCTTATT CAGCATCACC TCCTTTCCCT CCTCCCAACT 
30901 CTGGTATTTC AGCAGCCTTT TAGCTGCGAA CTTTCTCCAA AGTCTAAATG GGATGTCAAA 
30961 TTCCTCATGT TCTTGTCCCT CCGCACCCAC TATCTTCATA TTGTTGCAGA TGAAACGCGC 
31021 CAGACCGTCT GAAGACACCT TCAACCCTGT GTACCCATAT GACACGGAAA CCGGCCCTCC
31081 AACTGTGCCT TTCCTTACCC CTCCCTTTGT GTCGCCAAAT GGGTTCCAAG AAAGTCCCCC 
31141 CGGAGTGCTT TCTTTGCGTC TTTCAGAACC TTTGGTTACC TCACACGGCA TGCTTGCGCT 
31201 AAAAATGGGC AGCGGCCTGT CCCTGGATCA GGCAGGCAAC CTTACATCAA ATACAATCAG 
31261 TGTTTCTCAA CCGCTAAAAA AAACAAAGTC CAATATAACT TTGGAAACAT CCGCGCCCCT 
31321 TACAGTCAGC TCAGGCGCCC TAACCATGGC CACAACTTCG CCTTTGGTGG TCTCTGACAA 
31381 CACTCTTACC ATGCAATCAC AAGCACCGCT AACCGTGCAA GACTCAAAAC TTAGCATTGC 
31441 TACCAAAGAG CCACTTACAG TGTTAGATGG AAAACTGGCC CTGCAGACAT CAGCCCCCCT 
31501 CTCTGCCACT GATAACAACG CCCTCACTAT CACTGCCTCA CCTCCTCTTA CTACTGCAAA 
31561 TGGTAGTCTG GCTGTTACCA TGGAAAACCC ACTTTACAAC AACAATGGAA AACTTGGGCT 
31621 CAAAATTGGC GGTCCTTTGC AAGTGGCCAC CGACTCACAT GCACTAACAC TAGGTACTGG 
31681 TCAGGGGGTT GCAGTTCATA ACAATTTGCT ACATACAAAA GTTACAGGCG CAATAGGGTT 
31741 TGATACATCT GGCAACATGG AACTTAAAAC TGGAGATGGC CTCTATGTGG ATAGCGCCGG 
31801 TCCTAACCAA AAACTACATA TTAATCTAAA TACCACAAAA GGCCTTGCTT TTGACAACAC 
31861 CGCAATAACA ATTAACGCTG GAAAAGGGTT GGAATTTGAA ACAGACTCCT CAAACGGAAA 
31921 TCCCATAAAA ACAAAAATTG GATCAGGCAT ACAATATAAT ACCAATGGAG CTATGGTTGC 
31981 AAAACTTGGA ACAGGCCTCA GTTTTGACAG CTCCGGAGCC ATAACAATGG GCAGCATAAA
32041 CAATGACAGA CTTACTCTTT GGACAACACC AGACCCATCC CCAAATTGCA GAATTGCTTC 
32101 AGATAAAGAC TGCAAGCTAA CTCTGGCGCT AACAAAATGT GGCAGTCAAA TTTTGGGCAC
32161 TGTTTCAGCT TTGGCAGTAT CAGGTAATAT GGCCTCCATC AATGGAACTC TAAGCAGTGT 
32221 AAACTTGGTT CTTAGATTTG ATGACAACGG AGTGCTTATG TCAAATTCAT CACTGGACAA 
32281 ACAGTATTGG AACTTTAGAA ACGGGGACTC CACTAACGGT CAACCATACA CTTATGCTGT 
32341 TGGGTTTATG CCAAACCTAA AAGCTTACCC AAAAACTCAA AGTAAAACTG CAAAAAGTAA 
32401 TATTGTTAGC CAGGTGTATC TTAATGGTGA CAAGTCTAAA CCATTGCATT TTACTATTAC 
32461 GCTAAATGGA ACAGATGAAA CCAACCAAGT AAGCAAATAC TCAATATCAT TCAGTTGGTC 
32521 CTGGAACAGT GGACAATACA CTAATGACAA ATTTGCCACC AATTCCTATA CCTTCTCCTA 
32581 CATTGCCCAG GAATAAAGAA TCGTGAACCT GTTGCATGTT ATGTTTCAAC GTGTTTATTT 
32641 TTCAATTGCA GAAAATTTCA AGTCATTTTT CATTCAGTAG TATAGCCCCA CCACCACATA 
32701 GCTTATACTA ATCACCGTAC CTTAATCAAA CTCACAGAAC CCTAGTATTC AACCTGCCAq
32761 CTCCCTCCCA ACACACAGAG TACACAGTCC TTTCTCCCCG GCTGGCCTTA AACAGCATCA
32821 TATCATGGGT AACAGACATA TTCTTAGGTG TTATATTCCA CACGGTCTCC TGTCGAGCCA 
32881 AACGCTCATC AGTGATGTTA ATAAACTCCC CGGGCAGCTC GCTTAAGTTC ATGTCGCTGT 
32941 CCAGCTGCTG AGCCACAGGC TGCTGTCCAA CTTGCGGTTG CTCAACGGGC GGCGAAGGAG 
33001 AAGTCCACGC CTACATGGGG GTAGAGTCAT AATCGTGCAT CAGGATAGGG CGGTGGTGCT
33061 GCAGCAGCGC GCGAATAAAC TGCTGCCGCG GCCGCTCCGT CGTGCAGGAA TACAACATGG 
33121 CAGTGGTCTC CTCAGCGATG ATTCGCACCG CCCGCAGCAT AAGGCGCCTT GTCCTCCGGG
33181 CACAGCAGCG CACCCTGATC TCACTTAAGT CAGCACAGTA ACTGCAGCAC AGTACCACAA 
33241 TATTGTTTAA AATCCCACAG TGCAAGGCGC TGTATCCAAA GCTCATGGCG GGGACCACAG 
33301 AACCCACGTG GCCATCATAC CACAAGCGCA GGTAGATTAA GTGGCGACCC CTCATAAACA 
33361 CGCTGGACAT AAACATTACC TCTTTTGGCA TGTTGTAATT CACCACCTCC CGGTACCATA 
33421 TAAACCTCTG ATTAAACATG GCGCCATCCA CCACCATCCT AAACCAGCTG GCCAAAACCT 
33481 GCCCGCCGGC TATGCACTGC AGGGAACCGG GACTGGAACA ATGACAGTGG AGAGCCCAGG 
33541 ACTCGTAACC ATGGATCATC ATGCTCGTCA TGATATCAAT GTTGGCACAA CACAGGCACA 
33601 CGTGCATACA CTTCCTCAGG ATTACAAGCT CCTCCCGCGT CAGAACCATA TCCCAGGGAA 
33661 CAACCCATTC CTGAATCAGC GTAAATCCCA CACTGCAGGG AAGACCTCGC ACGTAACTCA 
33721 CGTTGTGCAT TGTCAAAGTG TTACATTCGG GCAGCAGCGG ATGATCCTCC AGTATGGTAG 
33781 CGCGTGTCTC TGTCTCAAAA GGAGGTAGGC GATCCCTACT GTACGGAGTG CGCCGAGACA 
33841 ACCGAGATCG TGTTGGTCGT AGTGTCATGC CAAATGGAAC GCCGGACGTA GTCATATTTC 
33901 CTGAAGCAAA ACCAGGTGCG GGCGTGACAA ACAGATCTGC GTCTCCGGTC TCGTCGCTTA 
33961 GCTCGCTCTG TGTAGTAGTT GTAGTATATC CACTCTCTCA AAGCATCCAG GCGCCCCCTG 
34021 GCTTCGGGTT CTATGTAAAC TCCTTCATGC GCCGCTGCCC TGATAACATC CACCACCGCA 
34081 GAATAAGCCA CACCCAGCCA ACCTACACAT TCGTTCTGCG AGTCACACAC GGGAGGAGCG 
34141 GGAAGAGCTG GAAGAACCAT GTTTTTTTTT TTTATTCCAA AAGATTATCC AAAACCTCAA 
34201 AATGAAGATC TATTAAGTGA ACGCGCTCCC CTCCGGTGGC GTGGTCAAAC TCTACAGCCA 
34261 AAGAACAGAT AATGGCATTT GTAAGATGTT GCACAATGGC TTCCAAAAGG CAAACTGCCC 
34321 TCACGTCCAA GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT ATAAACATTC 
34381 CAGCACCTTC AACCATGCCC AAATAATTTT CATCTCGCCA CCTTATCAAT ATGTCTCTAA
34441 GCAAATCCCG AATATTAAGT CCGGCCATTG TAAAAATCTG CTCCAGAGCG CCCTCCACCT 
34501 TCAGCCTCAA GCAGCGAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA CCTGTATAAG 
34561 ATTCAAAAGC GGAACATTAA CAAAAATACC GCGATCCCGT AGGTCCCTTC GCAGGGCCAG 
34621 CTGAACATAA TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC CAGGAACCAT 
34681 GACAAAAGAA CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA CCAGCGTAGC
34741 CCCGATGTAA GCTTGTTGCA TGGGCGGCGA TATAAAATGC AAGGTACTGC TCAAAAAATC 
34801 AGGCAAAGCC TCGCGCAAAA AAGCAAGCAC ATCGTAGTCA TGCTCATGCA GATAAAGGCA 
34861 GGTAAGTTCC GGAACCACCA CAGAAAAAGA CACCATTTTT CTCTCAAACA TGTCTGCGGG 
34921 TTCCTGCATA AACACAAAAT AAAATAACAA AAAAAAAAAA ACATTTAAAC ATTAGAAGCC 
34981 TGTNTTACAA CAGGAAAAAC AACCCTTATA AGCATAAGAC GGACTACGGC CATGCCGGCG
35041 TGACCGTAAA AAAACTGGTC ACCGTGATTA AAAAGCACCA CCGACAGTTC CTCGGTCATG 
35101 TCCGGAGTCA TAATGTAAGA CTCGGTAAAC ACATCAGGTT GGTTAACATC GGTCAGTGCT
35161 AAAAAGCGAC CGAAATAGCC CGGGGGAATA CATACCCGCA GGCGTAGAGA CAACATTACA 
35221 GCCCCCATAG GAGGTATAAC AAAATTAATA GGAGAGAAAA ACACATAAAC ACCTGAAAAA
35281 CCCTCCTGCC TAGGCAAAAT AGCACCCTCC CGCTCCAGAA CAACATACAG CGCTTCCACA
35341 GCGGCAGCCA TAACAGTCAG CCTTACCAGT AAAAAAACCT ATTAAAAAAC ACCACTCGAC 
35401 ACGGCACCAG CTCAATCAGT CACAGTGTAA AAAGGGCCAA GTACAGAGCG AGTATATATA 
35461 GGACTAAAAA ATGACGTAAC GGTTAAAGTC CACAAAAACC ACCCAGAAAA CCGCACGCGA 
35521 ACCTACGCCC AGAAACGAAA GCCAAAAAAC CCACAACTTC CTCAAATCTT CACTTCCGTT 
35581 TTCCCACGAT ACGTCACTTC CCATTTTAAA AAAAAACTAC AATTCCC7UVT ACATGCAAGT
35641 TACTCCGCCC TAAAACCTAC GTCACCCGCC CCGTTCCCAC GCCCCGCGCC ACGTCACAAA 
35701 CTCCACCCCC TCATTATCAT ATTGGCTTCA ATCCAAAATA AGGTATATTA TTGATGATG 
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1 CATCATCAAT AATATACCTT ATTTTGGATT 
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG 
121 GATGTTGCAA GTGTGGCGGA ACACATGTAA 
181 GTGTGCGCCG GTGTACACAG GAAGTGACAA 
241 TAAATTTGGG CGTAACCGAG TAAGATTTGG 
301 AGTGAAATCT GAATAATTTT GTGTTACTCA 
361 GACTTTGACC GTTTACGTGG AGACTCGCCC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT 
GAAGCCAATA TGATAATGAG GGGGTGGAGT
GCGGGTGACG TAGTAGTGTG GCGGAAGTGT
GCGACGGATG TGGCAAAAGT GACGTTTTTG
TTTTCGCGCG GTTTTAGGCG GATGTTGTAG
CCATTTTCGC GGGAAAACTG AATAAGAGGA
TAGCGCGTAA TATTTGTCTA GGGCCGCGGG
AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC
TCAGCTGACG TGTAGTGTAT TTATACCCGG
CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC
TATCTGCCAC GGAGGTGTTA TTACCGAAGA
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GACTCTGTAA TGTTGGCGGT
781 GCAGGAAGGG ATTGACTTAC TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA
901 CCTTGTACCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAATTATGGG CAGTGGGTGA
1141 TAGAGTGGTG GGTTTGGTGT GGTAATTTTT TTTTTAATTT TTACAGTTTT GTGGTTTAAA
1201 GAATTTTGTA TTGTGATTTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGC CGTCCTAAAA TGGCGCCTGC TATCCTGAGA
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG
1501 CCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG
1801 TTTCTGTGGG GCTCATCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT TGTGAGACAC
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCGA TAATACCGAC GGAGGAGCAG
2161 CAGCAGCAGC AGGAGGAAGC CAGGCGGCGG CGGCAGGAGC AGAGCCCATG GAACCCGAGA
2221 GCCGGCCTGG ACCCTCGGGA ATGAATGTTG TACAGGTGGC TGAACTGTAT CCAGAACTGA
2281 GACGCATTTT GACAATTACA GAGGATGGGC AGGGGCTAAA GGGGGTAAAG AGGGAGCGGG
2341 GGGCTTGTGA GGCTACAGAG GAGGCTAGGA ATCTAGCTTT TAGCTTAATG ACCAGACACC
2401 GTCCTGAGTG TATTACTTTT CAACAGATCA AGGATAATTG CGCTAATGAG CTTGATCTGC
2461 TGGCGCAGAA GTATTCCATA GAGCAGCTGA CCACTTACTG GCTGCAGCCA GGGGATGATT
2521 TTGAGGAGGC TATTAGGGTA TATGCAAAGG TGGCACTTAG GCCAGATTGC AAGTACAAGA
2581 TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG
2641 AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG
2701 TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG
2761 GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA
2821 ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT
2881 GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG
2941 AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT
3001 CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT
3061 GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC
3121 TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA
3181 ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC
3241 AATGCAATTT GAGTCACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC
3301 TGAACGGGGT GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC
3361 GCACCAGGTG CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG CCTGTGATGC 
3421 TGGATGTGAC CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGCGCTGAGT 
3481 TTGGCTCTAG CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGCTTAAGGG 
3541 TGGGAAAGAA TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGCAGCCG 
3601 CCGCCGCCAT GAGCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT TTGACAACGC 
3661 GCATGCCCCC ATGGGCCGGG GTGCGTCAGA ATGTGATGGG CTCCAGCATT GATGGTCGCC 
3721 CCGTCCTGCC CGCAAACTCT ACTACCTTGA CCTACGAGAC CGTGTCTGGA ACGCCGTTGG
3781 AGACTGCAC3C CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG ATTGTGACTG 
3841 ACTTTGCTTT CCTGAGCCCG CTTGCAAGCA GTGCAGCTTC CCGTTCATCC GCCCGCGATG 
3901 ACAAGTTGAC GGCTCTTTTG GCACAATTGG ATTCTTTGAC CCGGGAACTT AATGTCGTTT
3961 CTCAGCAGCT GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTTCC TCCCCTCCCA 
4021 ATGCGGTTTA AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTTGGATC AAGCAAGTGT 
4081 CTTGCTGTCT TTATTTAGGG GTTTTGCGCG CGCGGTAGGC CCGGGACCAG CGGTCTCGGT 
4141 CGTTGAGGGT CCTGTGTATT TTTTCCAGGA CGTGGTAAAG GTGACTCTGG ATGTTCAGAT
4201 ACATGGGCAT AAGCCCGTCT CTGGGGTGGA GGTAGCACCA CTGCAGAGCT TCATGCTGCG
4261 GGGTGGTGTT GTAGATGATC CAGTCGTAGC AGGAGCGCTG GGCGTGGTGC CTAAAAATGT
4321 CTTTCAGTAG CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGTTT ACAAAGCGGT
4381 TAAGCTGGGA TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTTAGGT
4441 TGGCTATGTT CCCAGCCATA TCCCTCCGGG GATTCATGTT GTGCAGAACC ACCAGCACAG 
4501 TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAAATGCG TGGAAGAACT 
4561 TGGAGACGCC CTTGTGACCT CCAAGATTTT CCATGCATTC GTCCATAATG ATGGCAATGG 
4621 GCCCACGGGC GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAGTTGTGTT 
4681 CCAGGATGAG ATCGTCATAG GCCATTTTTA CAAAGCGCGG GCGGAGGGTG CCAGACTGCG 
4741 GTATAATGGT TCCATCCGGC CCAGGGGCGT AGTTACCCTC ACAGATTTGC ATTTCCCACG 
4801 CTTTGAGTTC AGATGGGGGG ATCATGTCTA CCTGCGGGGC GATGAAGAAA ACGGTTTCCG
4861 GGGTAGGGGA GATCAGCTGG GAAGAAAGCA GGTTCCTGAG CAGCTGCGAC TTACCGCAGC 
4921 CGGTGGGCCC GTAAATCACA CCTATTACCG GGTGCAACTG GTAGTTAAGA GAGCTGCAGC 
4981 TGCCGTCATC CCTGAGCAGG GGGGCCACTT CGTTAAGCAT GTCCCTGACT CGCATGTTTT 
5041 CCCTGACCAA ATCCGCCAGA AGGCGCTCGC CGCCCAGCGA TAGCAGTTCT TGCAAGGAAG 
5101 CAAAGTTTTT CAACGGTTTG AGACCGTCCG CCGTAGGCAT GCTTTTGAGC GTTTGACCAA 
5161 GCAGTTCCAG GCGGTCCCAC AGCTCGGTCA CCTGCTCTAC GGCATCTCGA TCCAGCATAT 
5221 CTCCTCGTTT CGCGGGTTGG GGCGGCTTTC GCTGTACGGC AGTAGTCGGT GCTCGTCCAG
5281 ACGGGCCAGG GTCATGTCTT TCCACGGGCG CAGGGTCCTC GTCAGCGTAG TCTGGGTCAC 
5341 GGTGAAGGGG TGCGCTCCGG GCTGCGCGCT GGCCAGGGTG CGCTTGAGGC TGGTCCTGCT 
5401 GGTGCTGAAG CGCTGCCGGT CTTCGCCCTG CGCGTCGGCC AGGTAGCATT TGACCATGGT 
5461 GTCATAGTCC AGCCCCTCCG CGGCGTGGCC CTTGGCGCGC AGCTTGCCCT TGGAGGAGGC 
5521 GCCGCACGAG GGGCAGTGCA GACTTTTGAG GGCGTAGAGC TTGGGCGCGA GAAATACCGA 
5581 TTCCGGGGAG TAGGCATCCG CGCCGCAGGC CCCGCAGACG GTCTCGCATT CCACGAGCCA 
5641 GGTGAGCTCT GGCCGTTCGG GGTCAAAAAC CAGGTTTCCC CCATGCTTTT TGATGCGTTT 
5701 CTTACCTCTG GTTTCCATGA GCCGGTGTCC ACGCTCGGTG ACGAAAAGGC TGTCCGTGTC
5761 CCCGTATACA GACTTGAGAG GCCTGTCCTC GAGCGGTGTT CCGCGGTCCT CCTCGTATAG 
5821 AAACTCGGAC CACTCTGAGA CAAAGGCTCG CGTCCAGGCC AGCACGAAGG AGGCTAAGTG 
5881 GGAGGGGTAG CGGTCGTTGT CCACTAGGGG GTCCACTCGC TCCAGGGTGT GAAGACACAT 
5941 GTCGCCCTCT TCGGCATCAA GGAAGGTGAT TGGTTTGTAG GTGTAGGCCA CGTGACCGGG 
6001 TGTTCCTGAA GGGGGGCTAT AAAAGGGGGT GGGGGCGCGT TCGTCCTCAC TCTCTTCCGC 
6061 ATCGCTGTCT GCGAGGGCCA GCTGTTGGGG TGAGTACTCC CTCTGAAAAG CGGGCATGAC
6121 TTCTGCGCTA AGATTGTCAG TTTCCAAAAA CGAGGAGGAT TTGATATTCA CCTGGCCCGC 
6181 GGTGATGCCT TTGAGGGTGG CCGCATCCAT CTGGTCAGAA AAGACAATCT TTTTGTTGTC
6241 AAGCTTGGTG GCAAACGACC CGTAGAGGGC GTTGGACAGC AACTTGGCGA TGGAGCGCAG
6301 GGTTTGGTTT TTGTCGCGAT CGGCGCGCTC CTTGGCCGCG ATGTTTAGCT GCACGTATTC 
6361 GCGCGCAACG CACCGCCATT CGGGAAAGAC GGTGGTGCGC TCGTCGGGCA CCAGGTGCAC 
6421 GCGCCAACCG CGGTTGTGCA GGGTGACAAG GTCAACGCTG GTGGCTACCT CTCCGCGTAG
6481 GCGCTCGTTG GTCCAGCAGA GGCGGCCGCC CTTGCGCGAG CAGAATGGCG GTAGGGGGTC 
6541 TAGCTGCGTC TCGTCCGGGG GGTCTGCGTC CACGGTAAAG ACCCCGGGCA GCAGGCGCGC 
6601 GTCGAAGTAG TCTATCTTGC ATCCTTGCAA GTCTAGCGCC TGCTGCCATG CGCGGGCGGC
6661 AAGCGCGCGC TCGTATGGGT TGAGTGGGGG ACCCCATGGC ATGGGGTGGG TGAGCGCGGA
6721 GGCGTACATG CCGCAAATGT CGTAAACGTA GAGGGGCTCT CTGAGTATTC CAAGATATGT
6781 AGGGTAGCAT CTTCCACCGC GGATGCTGGC GCGCACGTAA TCGTATAGTT CGTGCGAGGG
6841 AGCGAGGAGG TCGGGACCGA GGTTGCTACG GGCGGGCTGC TCTGCTCGGA AGACTATCTG
6901 CCTGAAGATG GCATGTGAGT TGGATGATAT GGTTGGACGC TGGAAGACGT TGAAGCTGGC
6961 GTCTGTGAGA CCTACCGCGT CACGCACGAA GGAGGCGTAG GAGTCGCGCA GCTTGTTGAC
7021 CAGCTCGGCG GTGACCTGCA CGTCTAGGGC GCAGTAGTCC AGGGTTTCCT TGATGATGTC
7081 ATACTTATCC TGTCCCTTTT TTTTCCACAG CTCGCGGTTG AGGACAAACT CTTCGCGGTC
7141 TTTCCAGTAC TCTTGGATCG GAAACCCGTC GGCCTCCGAA CGGTAAGAGC CTAGCATGTA
7201 GAACTGGTTG ACGGCCTGGT AGGCGCAGCA TCCCTTTTCT ACGGGTAGCG CGTATGCCTG
7261 CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG CGCAAAGGTG TCCCTGACCA TGACTTTGAG
7321 GTACTGGTAT TTGAAGTCAG TGTCGTCGCA TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT
7381 GCGCTTTTTG GAACGCGGAT TTGGCAGGGC GAAGGTGACA TCGTTGAAGA GTATCTTTCC
7441 CGCGCGAGGC ATAAAGTTGC GTGTGATGCG GAAGGGTCCC GGCACCTCGG AACGGTTGTT
7501 AATTACCTGG GCGGCGAGCA CGATCTCGTC AAAGCCGTTG ATGTTGTGGC CCACAATGTA
7561 AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT
7621 GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC TGAAAGGGCC CAGTCTGCAA GATGAGGGTT
7681 GGAAGCGACG AATGAGCTCC ACAGGTCACG GGCCATTAGC ATTTGCAGGT GGTCGCGAAA
7741 GGTCCTAAAC TGGCGACCTA TGGCCATTTT TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG
7801 GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT CGCGGCTAGG TCTCGCGCGG CAGTCACTAG
7861 AGGCTCATCT CCGCCGAACT TCATGACCAG CATGAAGGGC ACGAGCTGCT TCCCAAAGGC
7921 CCCCATCCAA GTATAGGTCT CTACATCGTA GGTGACAAAG AGACGCTCGG TGCGAGGATG
7981 CGAGCCGATC GGGAAGAACT GGATCTCCCG CCACCAATTG GAGGAGTGGC TATTGATGTG
8041 GTGAAAGTAG AAGTCCCTGC GACGGGCCGA ACACTCGTGC TGGCTTTTGT AAAAACGTGC
8101 GCAGTACTGG CAGCGGTGCA CGGGCTGTAC ATCCTGCACG AGGTTGACCT GACGACCGCG
8161 CACAAGGAAG CAGAGTGGGA ATTTGAGCCC CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC
8221 TACTTCGGCT GCTTGTCCTT GACCGTCTGG CTGCTCGAGG GGAGTTACGG TGGATCGGAC
8281 CACCACGCCG CGCGAGCCCA AAGTCCAGAT GTCCGCGCGC GGCGGTCGGA GCTTGATGAC
8341 AACATCGCGC AGATGGGAGC TGTCCATGGT CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG
8401 GAGCTCCTGC AGGTTTACCT CGCATAGACG GGTCAGGGCG CGGGCTAGAT CCAGGTGATA
8461 CCTAATTTCC AGGGGCTGGT TGGTGGCGGC GTCGATGGCT TGCAAGAGGC CGCATCCCCG
8521 CGGCGCGACT ACGGTACCGC GCGGCGGGCG GTGGGCCGCG GGGGTGTCCT TGGATGATGC
8581 ATCTAAAAGC GGTGACGCGG GCGAGCCCCC GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG
8641 AGAGGGGGCA GGGGCACGTC GGCGCCGCGC GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG
8701 TTGCTGGCGA ACGCGACGAC GCGGCGGTTG ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG
8761 ACGACGGGCC CGGTGAGCTT GAGCCTGAAA GAGAGTTCGA CAGAATCAAT TTCGGTGTCG
8821 TTGACGGCGG CCTGGCGCAA AATCTCCTGC ACGTCTCCTG AGTTGTCTTG ATAGGCGATC
8881 TCGGCCATGA ACTGCTCGAT CTCTTCCTCC TGGAGATCTC CGCGTCCGGC TCGCTCCACG
8941 GTGGCGGCGA GGTCGTTGGA AATGCGGGCC ATGAGCTGCG AGAAGGCGTT GAGGCCTCCC
9001 TCGTTCCAGA CGCGGCTGTA GACCACGCCC CCTTCGGCAT CGCGGGCGCG CATGACCACC
9061 TGCGCGAGAT TGAGCTCCAC GTGCCGGGCG AAGACGGCGT AGTTTCGCAG GCGCTGAAAG
9121 AGGTAGTTGA GGGTGGTGGC GGTGTGTTCT GCCACGAAGA AGTACATAAC CCAGCGTCGC
9181 AACGTGGATT CGTTGATATC CCCCAAGGCC TCAAGGCGCT CCATGGCCTC GTAGAAGTCC
9241 ACGGCGAAGT TGAAAAACTG GGAGTTGCGC GCCGACACGG TTAACTCCTC CTCCAGAAGA
9301 CGGATGAGCT CGGCGACAGT GTCGCGCACC TCGCGCTCAA AGGCTACAGG GGCCTCTTCT
9361 TCTTCTTCAA TCTCCTCTTC CATAAGGGCC TCCCCTTCTT CTTCTTCTGG CGGCGGTGGG
9421 GGAGGGGGGA CACGGCGGCG ACGACGGCGC ACCGGGAGGC GGTCGACAAA GCGCTCGATC
9481 ATCTCCCCGC GGCGACGGCG CATGGTCTCG GTGACGGCGC GGCCGTTCTC GCGGGGGCGC
9541 AGTTGGAAGA CGCCGCCCGT CATGTCCCGG TTATGGGTTG GCGGGGGGCT GCCATGCGGC
9601 AGGGATACGG CGCTAACGAT GCATCTCAAC AATTGTTGTG TAGGTACTCC GCCGCCGAGG
9661 GACCTGAGCG AGTCCGCATC GACCGGATCG GAAAACCTCT CGAGAAAGGC GTCTAACCAG
9721 TCACAGTCGC AAGGTAGGCT GAGCACCGTG GCGGGCGGCA GCGGGCGGCG GTCGGGGTTG
9781 TTTCTGGCGG AGGTGCTGCT GATGATGTAA TTAAAGTAGG CGGTCTTGAG ACGGCGGATG
9841 GTCGACAGAA GCACCATGTC CTTGGGTCCG GCCTGCTGAA TGCGCAGGCG GTCGGCCATG
9901 CCCCAGGCTT CGTTTTGACA TCGGCGCAGG TCTTTGTAGT AGTCTTGCAT GAGCCTTTCT
9961 ACCGGCACTT CTTCTTCTCC TTCCTCTTGT CCTGCATCTC TTGCATCTAT CGCTGCGGCG 
10021 GCGGCGGAGT TTGGCCGTAG GTGGCGCCCT CTTCCTCCCA TGCGTGTGAC CCCGAAGCCC 
10081 CTCATCGGCT GAAGCAGGGC TAGGTCGGCG ACAACGCGCT CGGCTAATAT GGCCTGCTGC 
10141 ACCTGCGTGA GGGTAGACTG GAAGTCATCC ATGTCCACAA AGCGGTGGTA TGCGCCCGTG 
10201 TTGATGGTGT AAGTGCAGTT GGCCATAACG GACCAGTTAA CGGTCTGGTG ACCCGGCTGC 
10261 GAGAGCTCGG TGTACCTGAG ACGCGAGTAA GCCCTCGAGT CAAATACGTA GTCGTTGCAA 
10321 GTCCGCACCA GGTACTGGTA TCCCACCAAA AAGTGCGGCG GCGGCTGGCG GTAGAGGGGC 
10381 CAGCGTAGGG TGGCCGGGGC TCCGGGGGCG AGATCTTCCA ACATAAGGCG ATGATATCCG 
10441 TAGATGTACC TGGACATCCA GGTGATGCCG GCGGCGGTGG TGGAGGCGCG CGGAAAGTCG 
10501 CGGACGCGGT TCCAGATGTT GCGCAGCGGC AAAAAGTGCT CCATGGTCGG GACGCTCTGG 
10561 CCGGTCAGGC GCGCGCAATC GTTGACGCTC TAGACCGTGC AAAAGGAGAG CCTGTAAGCG
10621 GGCACTCTTC CGTGGTCTGG TGGATAAATT CGCAAGGGTA TCATGGCGGA CGACCGGGGT 
10681 TCGAGCCCCG TATCCGGCCG TCCGCCGTGA TCCATGCGGT TACCGCCCGC GTGTCGAACC 
10741 CAGGTGTGCG ACGTCAGACA ACGGGGGAGT GCTCCTTTTG GCTTCCTTCC AGGCGCGGCG 
10801 GCTGCTGCGC TAGCTTTTTT GGCCACTGGC CGCGCGCAGC GTAAGCGGTT AGGCTGGAAA 
10861 GCGAAAGCAT TAAGTGGCTC GCTCCCTGTA GCCGGAGGGT TATTTTCCAA GGGTTGAGTC 
10921 GCGGGACCCC CGGTTCGAGT CTCGGACCGG CCGGACTGCG GCGAACGGGG GTTTGCCTCC 
10981 CCGTCATGCA AGACCCCGCT TGCAAATTCC TCCGGAAACA GGGACGAGCC CCTTTTTTGC 
11041 TTTTCCCAGA TGCATCCGGT GCTGCGGCAG ATGCGCCCCC CTCCTCAGCA GCGGCAAGAG 
11101 CAAGAGCAGC GGCAGACATG CAGGGCACCC TCCCCTCCTC CTACCGCGTC AGGAGGGGCG 
11161 ACATCCGCGG TTGACGCGGC AGCAGATGGT GATTACGAAC CCCCGCGGCG CCGGGCCCGG 
11221 CACTACCTGG ACTTGGAGGA GGGCGAGGGC CTGGCGCGGC TAGGAGCGCC CTCTCCTGAG 
11281 CGGTACCCAA GGGTGCAGCT GAAGCGTGAT ACGCGTGAGG CGTACGTGCC GCGGCAGAAC 
11341 CTGTTTCGCG ACCGCGAGGG AGAGGAGCCC GAGGAGATGC GGGATCGAAA GTTCCACGCA 
11401 GGGCGCGAGC TGCGGCATGG CCTGAATCGC GAGCGGTTGC TGCGCGAGGA GGACTTTGAG
11461 CCCGACGCGC GAACCGGGAT TAGTCCCGCG CGCGCACACG TGGCGGCCGC CGACCTGGTA 
11521 ACCGCATACG AGCAGACGGT GAACCAGGAG ATTAACTTTC AAAAAAGCTT TAACAACCAC 
11581 GTGCGTACGC TTGTGGCGCG CGAGGAGGTG GCTATAGGAC TGATGCATCT GTGGGACTTT 
11641 GTAAGCGCGC TGGAGCAAAA CCCAAATAGC AAGCCGCTCA TGGCGCAGCT GTTCCTTATA 
11701 GTGCAGCACA GCAGGGACAA CGAGGCATTC AGGGATGCGC TGCTAAACAT AGTAGAGCCC 
11761 GAGGGCCGCT GGCTGCTCGA TTTGATAAAC ATCCTGCAGA GCATAGTGGT GCAGGAGCGC
11821 AGCTTGAGCC TGGCTGACAA GGTGGCCGCC ATCAACTATT CCATGCTTAG CCTGGGCAAG 
11881 TTTTACGCCC GCAAGATATA CCATACCCCT TACGTTCCCA TAGACAAGGA GGTAAAGATC 
11941 GAGGGGTTCT ACATGCGCAT GGCGCTGAAG GTGCTTACCT TGAGCGACGA CCTGGGCGTT 
12001 TATCGCAACG AGCGCATCCA CAAGGCCGTG AGCGTGAGCC GGCGGCGCGA GCTCAGCGAC 
12061 CGGGAGCTGA TGCACAGCCT GCAAAGGGCC CTGGCTGGCA CGGGCAGCGG CGATAGAGAG 
12121 GCCGAGTCCT ACTTTGACGC GGGCGCTGAC CTGCGCTGGG CCCCAAGCCG ACGCGCCCTG 
12181 GAGGCAGCTG GGGCCGGACC TGGGCTGGCG GTGGCACCCG CGCGCGCTGG CAACGTCGGC 
12241 GGCGTGGAGG AATATGACGA GGACGATGAG TACGAGCCAG AGGACGGCGA GTACTAAGCG 
12301 GTGATGTTTC TGATCAGATG ATGCAAGACG CAACGGACCC GGCGGTGCGG GCGGCGCTGC 
12361 AGAGCCAGCC GTCCGGCCTT AACTCCACGG ACGACTGGCG CCAGGTCATG GACCGCATCA 
12421 TGTCGCTGAC TGCGCGCAAT CCTGACGCGT TCCGGCAGCA GCCGCAGGCC AACCGGCTCT 
12481 CCGCAATTCT GGAAGCGGTG GTCCCGGCGC GCGCAAACCC CACGCACGAG AAGGTGCTGG 
12541 CGATCGTAAA CGCGCTGGCC GAAAACAGGG CCATCCGGCC CGACGAGGCC GGCCTGGTCT 
12601 ACGACGCGCT GCTTCAGCGC GTGGCTCGTT ACAACAGCGG CAACGTGCAG ACCAACCTGG
12661 ACCGGCTGGT GGGGGATGTG CGCGAGGCCG TGGCGCAGCG TGAGCGCGCG CAGCAGCAGG 
12721 GCAACCTGGG CTCCATGGTT GCACTAAACG CCTTCCTGAG TACACAGCCC GCCAACGTGC 
12781 CGCGGGGACA GGAGGACTAC ACCAACTTTG TGAGCGCACT GCGGCTAATG GTGACTGAGA 
12841 CACCGCAAAG TGAGGTGTAC CAGTCTGGGG CAGACTATTT TTTCCAGACC AGTAGACAAG 
12901 GCCTGCAGAC CGTAAACCTG AGCCAGGCTT TCAAAAACTT GCAGGGGCTG TGGGGGGTGC 
12961 GGGCTCCCAC AGGCGACCGC GCGACCGTGT CTAGCTTGCT GACGCCCAAC TCGCGCCTGT 
13021 TGCTGCTGCT AATAGCGCCC TTCACGGACA GTGGCAGCGT GTCCCGGGAC ACATACCTAG 
13081 GTCACTTGCT GACACTGTAC CGCGAGGCCA TAGGTCAGGC GCATGTGGAC GAGCATACTT 
13141 TCCAGGAGAT TACAAGTGTC AGCCGCGCGC TGGGGCAGGA GGACACGGGC AGCCTGGAGG 
13201 CAACCCTAAA CTACCTGCTG ACCAACCGGC GGCAGAAGAT CCCCTCGTTG CACAGTTTAA
13261 ACAGCGAGGA GGAGCGCATT TTGCGCTACG TGCAGCAGAG CGTGAGCCTT AACCTGATGC 
13321 GCGACGGGGT AACGCCCAGC GTGGCGCTGG ACATGACCGC GCGCAACATG GAACCGGGCA 
13381 TGTATGCCTC AAACCGGCCG TTTATCAACC GCCTAATGGA CTACTTGCAT CGCGCGGCCG 
13441 CCGTGAACCC CGAGTATTTC ACCAATGCCA TCTTGAACCC GCACTGGCTA CCGCCCCCTG 
13501 GTTTCTACAC CGGGGGATTC GAGGTGCCCG AGGGTAACGA TGGATTCCTC TGGGACGACA 
13561 TAGACGACAG CGTGTTTTCC CCGCAACCGC AGACCCTGCT AGAGTTGCAA CAGCGCGAGC
13621 AGGCAGAGGC GGCGCTGCGA AAGGAAAGCT TCCGCAGGCC AAGCAGCTTG TCCGATCTAG
13681 GCGCTGCGGC CCCGCGGTCA GATGCTAGTA GCCCATTTCC AAGCTTGATA GGGTCTCTTA 
13741 CCAGCACTCG CACCACCCGC CCGCGCCTGC TGGGCGAGGA GGAGTACCTA AACAACTCGC 
13801 TGCTGCAGCC GCAGCGCGAA AAAAACCTGC CTCCGGCATT TCCCAACAAC GGGATAGAGA 
13861 GCCTAGTGGA CAAGATGAGT AGATGGAAGA CGTACGCGCA GGAGCACAGG GACGTGCCAG 
13921 GCCCGCGCCC GCCCACCCGT CGTCAAAGGC ACGACCGTCA GCGGGGTCTG GTGTGGGAGG
13981 ACGATGACTC GGCAGACGAC AGCAGCGTCC TGGATTTGGG AGGGAGTGGC AACCCGTTTG 
14041 CGCACCTTCG CCCCAGGCTG GGGAGAATGT TTTAAAAAAA AAAAAGCATG ATGCAAAATA 
14101 AAAAACTCAC CAAGGCCATG GCACCGAGCG TTGGTTTTCT TGTATTCCCC TTAGTATGCG 
14161 GCGCGCGGCG ATGTATGAGG AAGGTCCTCC TCCCTCCTAC GAGAGTGTGG TGAGCGCGGC 
14221 GCCAGTGGCG GCGGCGCTGG GTTCTCCCTT CGATGCTCCC CTGGACCCGC CGTTTGTGCC 
14281 TCCGCGGTAC CTGCGGCCTA CCGGGGGGAG AAACAGCATC CGTTACTCTG AGTTGGCACC 
14341 CCTATTCGAC ACCACCCGTG TGTACCTGGT GGACAACAAG TCAACGGATG TGGCATCCCT 
14401 GAACTACCAG AACGACCACA GCAACTTTCT GACCACGGTC ATTCAAAACA ATGACTACAG 
14461 CCGGGGGGAG GCAAGCACAC AGACCATCAA TCTTGACGAC CGGTCGCACT GGGGCGGCGA
14521 CCTGAAAACC ATCCTGCATA CCAACATGCC AAATGTGAAC GAGTTCATGT TTACCAATAA
14581 GTTTAAGGCG CGGGTGATGG TGTCGCGCTT GCCTACTAAG GACAATCAGG TGGAGCTGAA 
14641 ATACGAGTGG GTGGAGTTCA CGCTGCCCGA GGGCAACTAC TCCGAGACCA TGACCATAGA
14701 CCTTATGAAC AACGCGATCG TGGAGCACTA CTTGAAAGTG GGCAGACAGA ACGGGGTTCT 
14761 GGAAAGCGAC ATCGGGGTAA AGTTTGACAC CCGCAACTTC AGACTGGGGT TTGACCCCGT 
14821 CACTGGTCTT GTCATGCCTG GGGTATATAC AAACGAAGCC TTCCATCCAG ACATCATTTT 
14881 GCTGCCAGGA TGCGGGGTGG ACTTCACCCA CAGCCGCCTG AGCAACTTGT TGGGCATCCG 
14941 CAAGCGGCAA CCCTTCCAGG AGGGCTTTAG GATCACCTAC GATGATCTGG AGGGTGGTAA 
15001 CATTCCCGCA CTGTTGGATG TGGACGCCTA CCAGGCGAGC TTGAAAGATG ACACCGAACA 
15061 GGGCGGGGGT GGCGCAGGCG GCAGCAACAG CAGTGGCAGC GGCGCGGAAG AGAACTCCAA 
15121 CGCGGCAGCC GCGGCAATGC AGCCGGTGGA GGACATGAAC GATCATGCCA TTCGCGGCGA 
15181 CACCTTTGCC ACACGGGCTG AGGAGAAGCG CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 
15241 CGCCCCCGCT GCGCAACCCG AGGTCGAGAA GCCTCAGAAG AAACCGGTGA TCAAACCCCT 
15301 GACAGAGGAC AGCAAGAAAC GCAGTTACAA CCTAATAAGC AATGACAGCA CCTTCACCCA 
15361 GTACCGCAGC TGGTACCTTG CATACAACTA CGGCGACCCT CAGACCGGAA TCCGCTCATG 
15421 GACCCTGCTT TGCACTCCTG ACGTAACCTG CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 
15481 AGACATGATG CAAGACCCCG TGACCTTCCG CTCCACGCGC CAGATCAGCA ACTTTCCGGT 
15541 GGTGGGCGCC GAGCTGTTGC CCGTGCACTC CAAGAGCTTC TACAACGACC AGGCCGTCTA 
15601 CTCCCAACTC ATCCGCCAGT TTACCTCTCT GACCCACGTG TTCAATCGCT TTCCCGAGAA 
15661 CCAGATTTTG GCGCGCCCGC CAGCCCCCAC CATCACCACC GTCAGTGAAA ACGTTCCTGC 
15721 TCTCACAGAT CACGGGACGC TACCGCTGCG CAACAGCATC GGAGGAGTCC AGCGAGTGAC 
15781 CATTACTGAC GCCAGACGCC GCACCTGCCC CTACGTTTAC AAGGCCCTGG GCATAGTCTC 
15841 GCCGCGCGTC CTATCGAGCC GCACTTTTTG AGCAAGCATG TCCATCCTTA TATCGCCCAG 
15901 CAATAACACA GGCTGGGGCC TGCGCTTCCC AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 
15961 CTCCGACCAA CACCCAGTGC GCGTGCGCGG GCACTACCGC GCGCCCTGGG GCGCGCACAA 
16021 ACGCGGCCGC ACTGGGCGCA CCACCGTCGA TGACGCCATC GACGCGGTGG TGGAGGAGGC 
16081 GCGCAACTAC ACGCCCACGC CGCCACCAGT GTCCACAGTG GACGCGGCCA TTCAGACCGT 
16141 GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT GAAGAGACGG CGGAGGCGCG TAGCACGTCG 
16201 CCACCGCCGC CGACCCGGCA CTGCCGCCCA ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 
16261 ACGTCGCACC GGCCGACGGG CGGCCATGCG GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 
16321 CACTGTGCCC CCCAGGTCCA GGCGACGAGC GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 
16381 TATGACTCAG GGTCGCAGGG GCAACGTGTA TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 
16441 CGTGCCCGTG CGCACCCGCC CCCCGCGCAA CTAGATTGCA AGAAAAAACT ACTTAGACTC 
16501 GTACTGTTGT ATGTATCCAG CGGCGGCGGC GCGCAACGAA GCTATGTCCA AGCGCAAAAT
16561 CAAAGAAGAG ATGCTCCAGG TCATCGCGCC GGAGATCTAT GGCCCCCCGA AGAAGGAAGA 
16621 GCAGGATTAC AAGCCCCGAA AGCTAAAGCG GGTCAAAAAG AAAAAGAAAG ATGATGATGA
16681 TGAACTTGAC GACGAGGTGG AACTGCTGCA CGCTACCGCG CCCAGGCGAC GGGTACAGTG 
16741 GAAAGGTCGA CGCGTAAAAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TAACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGC CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAAC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCGCCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACTA CCAGTAGCAC 
17221 CAGTATTGCC ACCGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCAGCGGT 
17281 GGCGGATGCC GCGGTGCAGG CGGTCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GCGTTTCAGC CCCCCGGCGC CCGCGCGGTT CGAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATTG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGGCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 
17641 CATCGTTTAA AAGCCGGTCT TTGTCSGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG 
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTGCATGTG 
17941 GAAAAATCAA AATAAAAAGT CTGGACTCTC ACGCTCGCTT GGTCCTGTAA CTATTTTGTA 
18001 GAATGGAAGA CATCAACTTT GCGTCTCTGG CCCCGCGACA CGGCTCGCGC CCGTTCATGG 
18061 GAAACTGGCA AGATATCGGC ACCAGCAATA TGAGCGGTGG CGCCTTCAGC TGGGGCTCGC 
18121 TGTGGAGCGG CATTAAAAAT TTCGGTTCCA CCGTTAAGAA CTATGGCAGC AAGGCCTGGA 
18181 ACAGCAGCAC AGGCCAGATG CTGAGGGATA AGTTGAAAGA GCAAAATTTC CAACAAAAGG 
18241 TGGTAGATGG CCTGGCCTCT GGCATTAGCG GGGTGGTGGA CCTGGCCAAC CAGGCAGTGC 
18301 AAAATAAGAT TAACAGTAAG CTTGATCCCC GCCCTCCCGT AGAGGAGCCT CCACCGGCCG
18361 TGGAGACAGT GTCTCCAGAG GGGCGTGGCG AAAAGCGTCC GCGCCCCGAC AGGGAAGAAA
18421 CTCTGGTGAC GCAAATAGAC GAGCCTCCCT CGTACGAGGA GGCACTAAAG CAAGGCCTGC
18481 CCACCACCCG TCCCATCGCG CCCATGGCTA CCGGAGTGCT GGGCCAGCAC ACACCCGTAA 
18541 CGCTGGACCT GCCTCCCCCC GCCGACACCC AGCAGAAACC TGTGCTGCCA GGCCCGACCG 
18601 CCGTTGTTGT AACCCGTCCT AGCCGCGCGT CCCTGCGCCG CGCCGCCAGC GGTCCGCGAT 
18661 CGTTGCGGCC CGTAGCCAGT GGCAACTGGC AAAGCACACT GAACAGCATC GTGGGTCTGG 
18721 GGGTGCAATC CCTGAAGCGC CGACGATGCT TCTGAATAGC TAACGTGTCG TATGTGTGTC 
18781 ATGTATGCGT CCATGTCGCC GCCAGAGGAG CTGCTGAGCC GCCGCGCGCC CGCTTTCCAA 
18841 GATGGCTACC CCTTCGATGA TGCCGCAGTG GTCTTACATG CACATCTCGG GCCAGGACGC
18901 CTCGGAGTAC CTGAGCCCCG GGCTGGTGCA GTTTGCCCGC GCCACCGAGA CGTACTTCAG 
18961 CCTGAATAAC AAGTTTAGAA ACCCCACGGT GGCGCCTACG CACGACGTGA CCACAGACCG 
19021 GTCCCAGCGT TTGACGCTGC GGTTCATCCC TGTGGACCGT GAGGATACTG CGTACTCGTA 
19081 CAAGGCGCGG TTCACCCTAG CTGTGGGTGA TAACCGTGTG CTGGACATGG CTTCCACGTA
19141 CTTTGACATC CGCGGCGTGC TGGACAGGGG CCCTACTTTT AAGCCCTACT CTGGCACTGC 
19201 CTACAACGCC CTGGCTCCCA AGGGTGCCCC AAATCCTTGC GAATGGGATG AAGCTGCTAC 
19261 TGCTCTTGAA ATAAACCTAG AAGAAGAGGA CGATGACAAC GAAGACGAAG TAGACGAGCA
19321 AGCTGAGCAG CAAAAAACTC ACGTATTTGG GCAGGCGCCT TATTCTGGTA TAAATATTAC 
19381 AAAGGAGGGT ATTCAAATAG GTGTCGAAGG TCAAACACCT AAATATGCCG ATAAAACATT 
19441 TCAACCTGAA CCTCAAATAG GAGAATCTCA GTGGTACGAA ACTGAAATTA ATCATGCAGC 
19501 TGGGAGAGTC CTTAAAAAGA CTACCCCAAT GAAACCATGT TACGGTTCAT ATGCAAAACC 
19561 CACAAATGAA AATGGAGGGC AAGGCATTCT TGTAAAGCAA CAAAATGGAA AGCTAGAAAG 
19621 TCAAGTGGAA ATGCAATTTT TCTCAACTAC TGAGGCGACC GCAGGCAATG GTGATAACTT 
19681 GACTCCTAAA GTGGTATTGT ACAGTGAAGA TGTAGATATA GAAACCCCAG ACACTCATAT 
19741 TTCTTACATG CCCACTATTA AGGAAGGTAA CTCACGAGAA CTAATGGGCC AACAATCTAT 
19801 GCCCAACAGG CCTAATTACA TTGCTTTTAG GGACAATTTT ATTGGTCTAA TGTATTACAA
19861 CAGCACGGGT AATATGGGTG TTCTGGCGGG CCAAGCATCG CAGTTGAATG CTGTTGTAGA 
19921 TTTGCAAGAC AGAAACACAG AGCTTTCATA CCAGCTTTTG CTTGATTCCA TTGGTGATAG 
19981 AACCAGGTAC TTTTCTATGT GGAATCAGGC TGTTGACAGC TATGATCCAG ATGTTAGAAT 
20041 TATTGAAAAT CATGGAACTG AAGATGAACT TCCAAATTAC TGCTTTCCAC TGGGAGGTGT
20101 GATTAATACA GAGACTCTTA CCAAGGTAAA ACCTAAAACA GGTCAGGAAA ATGGATGGGA 
20161 AAAAGATGCT ACAGAATTTT CAGATAAAAA TGAAATAAGA GTTGGAAATA ATTTTGCCAT 
20221 GGAAATCAAT CTAAATGCCA ACCTGTGGAG AAATTTCCTG TACTCCAACA TAGCGCTGTA 
20281 TTTGCCCGAC AAGCTAAAGT ACAGTCCTTC CAACGTAAAA ATTTCTGATA ACCCAAACAC 
20341 CTACGACTAC ATGAACAAGC GAGTGGTGGC TCCCGGGTTA GTGGACTGCT ACATTAACCT 
20401 TGGAGCACGC TGGTCCCTTG ACTATATGGA CAACGTCAAC CCATTTAACC ACCACCGCAA
20461 TGCTGGCCTG CGCTACCGCT CAATGTTGCT GGGCAATGGT CGCTATGTGC CCTTCCACAT 
20521 CCAGGTGCCT CAGAAGTTCT TTGCCATTAA AAACCTCCTT CTCCTGCCGG GCTCATACAC 
20581 CTACGAGTGG AACTTCAGGA AGGATGTTAA CATGGTTCTG CAGAGCTCCC TAGGAAATGA
20641 CCTAAGGGTT GACGGAGCCA GCATTAAGTT TGATAGCATT TGCCTTTACG CCACCTTCTT
20701 CCCCATGGCC CACAACACCG CCTCCACGCT TGAGGCCATG CTTAGAAACG ACACCAACGA 
20761 CCAGTCCTTT AACGACTATC TCTCCGCCGC CAACATGCTC TAGCCTATAC CCGCCAACGC
20821 TACCAACGTG CCCATATCCA TCCCCTCCCG CAACTGGGCG GCTTTCCGCG GCTGGGCCTT
20881 CACGCGCCTT AAGACTAAGG AAACCCCATC ACTGGGCTCG GGCTACGACC CTTATTACAC 
20941 CTACTCTGGC TCTATACCCT ACCTAGATGG AACCTTTTAC CTCAACCACA CCTTTAAGAA
21001 GGTGGCCATT ACCTTTGACT CTTCTGTCAG CTGGCCTGGC AATGACCGCC TGCTTACCCC 
21061 CAACGAGTTT GAAATTAAGC GCTCAGTTGA CGGGGAGGGT TACAACGTTG CCCAGTGTAA 
21121 CATGACCAAA GACTGGTTCC TGGTACAAAT GCTAGCTAAC TACAACATTG GCTACCAGGG 
21181 CTTCTATATC CCAGAGAGCT ACAAGGACCG CATGTACTCC TTCTTTAGAA ACTTCCAGCC 
21241 CATGAGCCGT CAGGTGGTGG ATGATACTAA ATACAAGGAC TACCAACAGG TGGGCATCCT 
21301 ACACCAACAC AACAACTCTG GATTTGTTGG CTACCTTGCC CCCACCATGC GCGAAGGACA 
21361 GGCCTACCCT GCTAACTTCC CCTATCCGCT TATAGGCAAG ACCGCAGTTG ACAGCATTAC 
21421 CCAGAAAAAG TTTCTTTGCG ATCGCACCCT TTGGCGCATC CCATTCTCCA GTAACTTTAT 
21481 GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 
21541 GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT
21601 TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA 
21661 CCTGCGCACG CCCTTCTCGG CCGGCAACGC CACAACATAA AGAAGCAAGC AACATCAACA 
21721 ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 
21781 TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 
21841 AAGCTCGCCT GCGCCATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG 
21901 GCCTTTGCCT GGAACCCGCA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 
21961 GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 
22021 ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG 
22081 CCCAACTCGG CCGCCTGTGG ACTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 
22141 CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 
22201 ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 
22261 TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCAC* 
22321 TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGC 
22381 AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 
22441 GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 
22501 CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG 
22561 AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT
22621 ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG
22681 CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 
22741 ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGC 
22801 TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 
22861 AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 
22921 TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG 
22981 GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 
23041 ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 
23101 TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT
23161 ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC 
23221 CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG 
23281 TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC 
23341 TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT 
23401 TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC 
23461 AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG 
23521 TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC 
23581 ATACCACGCG CCACTGGGTC GTCTTCATTC AGCCGCCGCA CTGTGCGCTT ACCTCCTTTG 
23641 CCATGCTTGA TTAGCACCGG TGGGTTGCTG AAACCCACCA TTTGTAGCGC CACATCTTCT 
23701 CTTTCTTCCT CGCTGTCCAC GATTACCTCT GGTGATGGCG GGCGGTCGGG CTTGGGAGAA 
23761 GGGCGCTTCT TTTTCTTCTT GGGCGCAATG GCCAAATCCG CCGCCGAGGT CGATGGCCGC
23821 GGGCTGGGTG TGCGCGGCAC CAGCGCGTCT TGTGATGAGT CTTCCTCGTC CTCGGACTCG 
23881 ATACGCCGCC TCATCCGCTT TTTTGGGGGC GCCCGGGGAG GCGGCGGCGA CGGGGACGGG 
23941 GACGACACGT CCTCCATGGT TGGGGGACGT CGCGCCGCAC CGCGTCCGCG CTCGGGGGTG 
24001 GTTTCGCGCT GCTCCTCTTC CCGACTGGCC ATTTCCTTCT CCTATAGGCA GAAAAAGATC 
24061 ATGGAGTCAG TCGAGAAGAA GGACAGCCTA ACCGCCCCCT CTGAGTTCGC CACCACCGCC 
24121 TCCACCGATG CCGCCAACGC GCCTACCACC TTCCCCGTCG AGGCACCCCC GCTTGAGGAG
24181 GAGGAAGTGA TTATCGAGCA GGACCCAGGT TTTGTAAGCG AAGACGACGA GGACCGCTCA 
24241 GTACCAACAG AGGATAAAAA GCAAGACCAG GACAACGCAG AGGCAAACGA GGAACAAGTC 
24301 GGGCGGGGGG ACGAAAGGCA TGGCGACTAC CTAGATGTGG GAGACGACGT GCTGTTGAAG 
24361 CATCTGCAGC GCCAGTGCGC CATTATCTGC GACGCGTTGC AAGAGCGCAG CGATGTGCCC
24421 CTCGCCATAG CGGATGTCAG CCTTGCCTAC GAACGCCACC TATTCTCACC GCGCGTACCC 
24481 CCCAAACGCC AAGAAAACGG CACATGCGAG CCCAACCCGC GCCTCAACTT CTACCCCGTA 
24541 TTTGCCGTGC CAGAGGTGCT TGCCACCTAT CACATCTTTT TCCAAAACTG CAAGATACCC 
24601 CTATCCTGCC GTGCCAACCG CAGCCGAGCG GACAAGCAGC TGGCCTTGCG GCAGGGCGCT 
24661 GTCATACCTG ATATCGCCTC GCTCAACGAA GTGCCAAAAA TCTTTGAGGG TCTTGGACGC 
24721 GACGAGAAGC GCGCGGCAAA CGCTCTGCAA CAGGAAAACA GCGAAAATGA AAGTCACTCT 
24781 GGAGTGTTGG TGGAACTCGA GGGTGACAAC GCGCGCCTAG CCGTACTAAA ACGCAGCATC
24841 GAGGTCACCC ACTTTGCCTA CCCGGCACTT AACCTACCCC CCAAGGTCAT GAGCACAGTC 
24901 ATGAGTGAGC TGATCGTGCG CCGTGCGCAG CCCCTGGAGA GGGATGCAAA TTTGCAAGAA 
24961 CAAACAGAGG AGGGCCTACC CGCAGTTGGC GACGAGCAGC TAGCGCGCTG GCTTCAAACG
25021 CGCGAGCCTG CCGACTTGGA GGAGCGACGC AAACTAATGA TGGCCGCAGT GCTCGTTACC 
25081 GTGGAGCTTG AGTGCATGCA GCGGTTCTTT GCTGACCCGG AGATGCAGCG CAAGCTAGAG 
25141 GAAACATTGC ACTACACCTT TCGACAGGGC TACGTACGCC AGGCCTGCAA GATCTCCAAC 
25201 GTGGAGCTCT GCAACCTGGT CTCCTACCTT GGAATTTTGC ACGAAAACCG CCTTGGGCAA
25261 AACGTGCTTC ATTCCACGCT CAAGGGCGAG GCGCGCCGCG ACTACGTCCG CGACTGCGTT 
25321 TACTTATTTC TATGCTACAC CTGGCAGACG GCCATGGGCG TTTGGCAGCA GTGCTTGGAG 
25381 GAGTGCAACC TCAAGGAGCT GCAGAAACTG CTAAAGCAAA ACTTGAAGGA CCTATGGACG 
25441 GCCTTCAACG AGCGCTCCGT GGCCGCGCAC CTGGCGGACA TCATTTTCCC CGAACGCCTG 
25501 CTTAAAACCC TGCAACAGGG TCTGCCAGAC TTCACCAGTC AAAGCATGTT GCAGAACTTT 
25561 AGGAACTTTA TCCTAGAGCG CTCAGGAATC TTGCCCGCCA CCTGCTGTGC ACTTCCTAGC
25621 GACTTTGTGC CCATTAAGTA CCGCGAATGC CCTCCGCCGC TTTGGGGCCA CTGCTACCTT
25681 CTGCAGCTAG CCAACTACCT TGCCTACCAC TCTGACATAA TGGAAGACGT GAGCGGTGAC 
25741 GGTCTACTGG AGTGTCACTG TCGCTGCAAC CTATGCACCC CGCACCGCTC CCTGGTTTGC 
25801 AATTCGCAGC TGCTTAACGA AAGTCAAATT ATCGGTACCT TTGAGCTGCA GGGTCCCTCG
25861 CCTGACGAAA AGTCCGCGGC TCCGGGGTTG AAACTCACTC CGGGGCTGTG GACGTCGGCT 
25921 TACCTTCGCA AATTTGTACC TGAGGACTAC CACGCCCACG AGATTAGGTT CTACGAAGAC 
25981 CAATCCCGCC CGCCAAATGC GGAGCTTACC GCCTGCGTCA TTACCCAGGG CCACATTCTT
26041 GGCCAATTGC AAGCCATCAA CAAAGCCCGC CAAGAGTTTC TGCTACGAAA GGGACGGGGG 
26101 GTTTACTTGG ACCCCCAGTC CGGCGAGGAG CTCAACCCAA TCCCCCCGCC GCCGCAGCCC 
26161 TATCAGCAGC AGCCGCGGGC CCTTGCTTCC CAGGATGGCA CCCAAAAAGA AGCTGCAGCT 
26221 GCCGCCGCCA CCCACGGACG AGGAGGAATA CTGGGACAGT CAGGCAGAGG AGGTTTTGGA 
26281 CGAGGAGGAG GAGGACATGA TGGAAGACTG GGAGAGCCTA GACGAGGAAG CTTCCGAGGT
26341 CGAAGAGGTG TCAGACGAAA CACCGTCACC CTCGGTCGCA TTCCCCTCGC CGGCGCCCCA 
26401 GAAATCGGCA ACCGGTTCCA GCATGGCTAC AACCTCCGCT CCTCAGGCGC CGCCGGCACT
26461 GCCCGTTCGC CGACCCAACC GTAGATGGGA CACCACTGGA ACCAGGGCCG GTAAGTCCAA 
26521 GCAGCCGCCG CCGTTAGCCC AAGAGCAACA ACAGCGCCAA GGCTACCGCT CATGGCGCGG 
26581 GCACAAGAAC GCCATAGTTG CTTGCTTGCA AGACTGTGGG GGCAACATCT CCTTCGCCCG 
26641 CCGCTTTCTT CTCTACCATC ACGGCGTGGC CTTCCCCCGT AACATCCTGC ATTACTACCG 
26701 TCATCTCTAC AGCCCATACT GCACCGGCGG CAGCGGCAGC GGCAGCAACA GCAGCGGCCA
26761 CACAGAAGCA AAGGCGACCG GATAGCAAGA CTCTGACAAA GCCCAAGAAA TCCACAGCGG 
26821 CGGCAGCAGC AGGAGGAGGA GCGCTGCGTC TGGCGCCCAA CGAACCCGTA TCGACCCGCG
26881 AGCTTAGAAA CAGGATTTTT CCCACTCTGT ATGCTATATT TCAACAGAGC AGGGGCCAAG
26941 AAGAAGAGCT GAAAATAAAA AACAGGTCTC TGCGATCCCT CACCCGCAGC TGCCTGTATC 
27001 ACAAAAGCGA AGATCAGCTT CGGCGCACGC TGGAAGACGC GGAGGCTCTC TTCAGTAAAT 
27061 ACTGCGCGCT GACTCTTAAG GACTAGTTTC GCGCCCTTTC TCAAATTTAA GCGCGAAAAC 
27121 TACGTCATCT GCAGCGGCCA CACCCGGCGC CAGCACCTGT CGTCAGCGCC ATTATGAGCA 
27181 AGGAAATTCC CACGCCCTAC ATGTGGAGTT ACCAGCCACA AATGGGACTT GCGGCTGGAG 
27241 CTGCCCAAGA CTACTCAACC CGAATAAACT ACATGAGCGC GGGACCCCAC ATGATATCCC 
27301 GGGTCAACGG AATCCGCGCC CACCGAAACC GAATTCTCTT GGAACAGGCG GCTATTACCA 
27361 CCACACCTCG TAATAACCTT AATCCCCGTA GTTGGCCCGC TGCCCTGGTG TACCAGGAAA
27421 GTCCCGCTCC CACCACTGTG GTACTTCCCA GAGACGCCCA GGCCGAAGTT CAGATGACTA 
27481 ACTCAGGGGC GCAGCTTGCG GGCGGCTTTC GTCACAGGGT GCGGTCGCCC GGGCAGGGTA 
27541 TAACTCACCT GACAATCAGA GGGCGAGGTA TTCAGCTCAA CGACGAGTCG GTGAGCTCCT 
27601 CGCTTGGTCT CCGTCCGGAC GGGACATTTC AGATCGGCGG CGCCGGCCGT CCTTCATTCA 
27661 CGCCTCGTCA GGCAATCCTA ACTCTGCAGA CCTCGTCCTC TGAGCCGCGC TCTGGAGGCA 
27721 TTGGAACTCT GCAATTTATT GAGGAGTTTG TGCCATCGGT CTACTTTAAC CCCTTCTCGG
27781 GACCTCCCGG CCACTATCCG GATCAATTTA TTCCTAACTT TGACGCGGTA AAGGACTCGG 
27841 CGGACGGCTA CGACTGAATG TTAAGTGGAG AGGCAGAGCA ACTGCGCCTG AAACACCTGG 
27901 TCCACTGTCG CCGCCACAAG TGCTTTGCCC GCGACTCCGG TGAGTTTTGC TACTTTGAAT 
27961 TGCCCGAGGA TCATATCGAG GGCCCGGCGC ACGGCGTCCG GCTTACCGCC CAGGGAGAGC 
28021 TTGCCCGTAG CCTGATTCGG GAGTTTACCC AGCGCCCCCT GCTAGTTGAG CGGGACAGGG 
28081 GACCCTGTGT TCTCACTGTG ATTTGCAACT GTCCTAACCT TGGATTACAT CAAGATCTTT 
28141 GTTGCCATCT CTGTGCTGAG TATAATAAAT ACAGAAATTA AAATATACTG GGGCTCCTAT 
28201 CGCCATCCTG TAAACGCCAC CGTCTTCACC CGCCCAAGCA AACCAAGGCG AACCTTACCT 
28261 GGTACTTTTA ACATCTCTCC CTCTGTGATT TACAACAGTT TCAACCCAGA CGGAGTGAGT 
28321 CTACGAGAGA ACCTCTCCGA GCTCAGCTAC TCCATCAGAA AAAACACCAC CCTCCTTACC 
28381 TGCCGGGAAC GTACGAGTGC GTCACCGGCC GCTGCACCAC ACCTACCGCC TGACCGTAAA 
28441 CCAGACTTTT TCCGGACAGA CCTCAATAAC TCTGTTTACC AGAACAGGAG GTGAGCTTAG 
28501 AAAACCCTTA GGGTATTAGG CCAAAGGCGC AGCTACTGTG GGGTTTATGA ACAATTCAAG 
28561 CAACTCTACG GGCTATTCTA ATTCAGGTTT CTCTAGAATC GGGGTTGGGG TTATTCTCTG 
28621 TCTTGTGATT CTCTTTATTC TTATACTAAC GCTTCTCTGC CTAAGGCTCG CCGCCTGCTG
28681 TGTGCACATT TGCATTTATT GTCAGCTTTT TAAACGCTGG GGTCGCCACC CAAGATGATT 
28741 AGGTACATAA TCCTAGGTTT ACTCACCCTT GCGTCAGCCC ACGGTACCAC CCAAAAGGTG 
28801 GATTTTAAGG AGCCAGCCTG TAATGTTACA TTCGCAGCTG AAGCTAATGA GTGCACCACT 
28861 CTTATAAAAT GCACCACAGA ACATGAAAAG CTGCTTATTC GCCACAAAAA CAAAATTGGC 
28921 AAGTATGCTG TTTATGCTAT TTGGCAGCCA GGTGACACTA CAGAGTATAA TGTTACAGTT 
28981 TTCCAGGGTA AAAGTCATAA AACTTTTATG TATACTTTTC CATTTTATGA AATGTGCGAC 
29041 ATTACCATGT ACATGAGCAA ACAGTATAAG TTGTGGCCCC CACAAAATTG TGTGGAAAAC 
29101 ACTGGCACTT TCTGCTGCAC TGCTATGCTA ATTACAGTGC TCGCTTTGGT CTGTACCCTA 
29161 CTCTATATTA AATACAAAAG CAGACGCAGC TTTATTGAGG AAAAGAAAAT GCCTTAATTT
29221 ACTAAGTTAC AAAGCTAATG TCACCACTAA CTGCTTTACT CGCTGCTTGC AAAACAAATT 
29281 CAAAAAGTTA GCATTATAAT TAGAATAGGA TTTAAACCCC CCGGTCATTT CCTGCTCAAT 
29341 ACCATTCCCC TGAACAATTG ACTCTATGTG GGATATGCTC CAGCGCTACA ACCTTGAAGT 
29401 CAGGCTTCCT GGATGTCAGC ATCTGACTTT GGCCAGCACC TGTCCCGCGG ATTTGTTCCA 
29461 GTCCAACTAC AGCGACCCAC CCTAACAGAG ATGACCAACA CAACCAACGC GGCCGCCGCT 
29521 ACCGGACTTA CATCTACCAC AAATACACCC CAAGTTTCTG CCTTTGTCAA TAACTGGGAT 
29581 AACTTGGGCA TGTGGTGGTT CTCCATAGCG CTTATGTTTG TATGCCTTAT TATTATGTGG
29641 CTCATCTGCT GCCTAAAGCG CAAACGCGCC CGACCACCCA TCTATAGTCC CATCATTGTG
29701 CTACACCCAA ACAATGATGG AATCCATAGA TTGGACGGAC TGAAACACAT GTTCTTTTCT
29761 CTTACAGTAT GATTAAATGA GACATGATTC CTCGAGTTTT TATATTACTG ACCCTTGTTG 
29821 CGCTTTTTTG TGCGTGCTCC ACATTGGCTG CGGTTTCTCA CATCGAAGTA GACTGCATTC 
29881 CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 
29941 TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 
30001 TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 
30061 TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 
30121 GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 
30181 TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 
30241 GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 
30301 ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA
30361 AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC 
30421 TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 
30481 ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC
30541 GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT 
30601 GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 
30661 ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA
30721 TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 
30781 ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 
30841 AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA
30901 TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 
30961 AACTTTCTCC ACAATCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 
31021 ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 
31081 GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 
31141 GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 
31201 CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC
31261 GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC
31321 AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT
31381 GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC
31441 CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA
31501 GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 
31561 ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 
31621 GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 
31681 ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 
31741 TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 
31801 AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 
31861 TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA
31921 AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 
31981 AACAATTCCA AAAAGCTTGA GGTTAACCTA AGCACTGCCA AGGGGTTGAT GTTTGACGCT 
32041 ACAGCCATAG CCATTAATGC AGGAGATGGG CTTGAATTTG GTTCACCTAA TGCACCAAAC
32101 ACAAATCCCC TCAAAACAAA AATTGGCCAT GGCCTAGAAT TTGATTCAAA CAAGGCTATG
32161 GTTCCTAAAC TAGGAACTGG CCTTAGTTTT GACAGCACAG GTGCCATTAC AGTAGGAAAC
32221 AAAAATAATG ATAAGCTAAC TTTGTGGACC ACACCAGCTC CATCTCCTAA CTGTAGACTA 
32281 AATGCAGAGA AAGATGCTAA ACTCACTTTG GTCTTAACAA AATGTGGCAG TCAAATACTT 
32341 GCTACAGTTT CAGTTTTGGC TGTTAAAGGC AGTTTGGCTC CAATATCTGG AACAGTTCAA 
32401 AGTGCTCATC TTATTATAAG ATTTGACGAA AATGGAGTGC TACTAAACAA TTCCTTCCTG 
32461 GACCCAGAAT ATTGGAACTT TAGAAATGGA GATCTTACTG AAGGCACAGC CTATACAAAC 
32521 GCTGTTGGAT TTATGCCTAA CCTATCAGCT TATCCAAAAT CTCACGGTAA AACTGCCAAA 
32581 AGTAACATTG TCAGTCAAGT TTACTTAAAC GGAGACAAAA CTAAACCTGT AACACTAACC
32641 ATTACACTAA ACGGTACACA GGAAACAGGA GACACAACTC CAAGTGCATA CTCTATGTCA
32701 TTTTCATGGG ACTGGTCTGG CCACAACTAC ATTAATGAAA TATTTGCCAC ATCCTCTTAC
32761 ACTTTTTCAT ACATTGCCCA AGAATAAAGA ATCGTTTGTG TTATGTTTCA ACGTGTTTAT 
32821 TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT AGTATAGCCC CACCACCACA 
32881 TAGCTTATAC AGATCACCGT ACCTTAATCA AACTCACAGA ACCCTAGTAT TCAACCTGCC 
32941 ACCTCCCTCC CAACACACAG AGTACACAGT CCTTTCTCCC CGGCTGGCCT TAAAAAGCAT 
33001 CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC CACACGGTTT CCTGTCGAGC
33061 CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC TCACTTAAGT TCATGTCGCT
33121 GTCCAGCTGC TGAGCCACAG GCTGCTGTCC AACTTGCGGT TGCTTAACGG GCGGCGAAGG
33181 AGAAGTCCAC GCCTACATGG GGGTAGAGTC ATAATCGTGC ATCAGGATAG GGCGGTGGTG
33241 CTGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGCCGCTCC GTCCTGCAGG AATACAACAT
33301 GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC ATAAGGCGCC TTGTCCTCCG 
33361 GGCACAGCAG CGCACCCTGA TCTCACTTAA ATCAGCACAG TAACTGCAGC ACAGCACCAC 
33421 AATATTGTTC AAAATCCCAC AGTGCAAGGC GCTGTATCCA AAGCTCATGG CGGGGACCAC 
33481 AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT AAGTGGCGAC CCCTCATAAA
33541 CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA TTCACCACCT CCCGGTACCA
33601 TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC CTAAACCAGC TGGCCAAAAC
33661 CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA CAATGACAGT GGAGAGCCCA
33721 GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA ATGTTGGCAC AACACAGGCA
33781 CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC GTTAGAACCA TATCCCAGGG
33841 AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG GGAAGACCTC GCACGTAACT
33901 CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC GGATGATCCT CCAGTATGGT 
33961 AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA CTGTACGGAG TGCGCCGAGA 
34021 CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA ACGCCGGACG TAGTCATATT
34081 TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT GCGTCTCCGG TCTCGCCGCT 
34141 TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT CAAAGCATCC AGGCGCCCCC 
34201 TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC CCTGATAACA TCCACCACCG 
34261 CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG CGAGTCACAC ACGGGAGGAG 
34321 CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA AAAGATTATC CAAAACCTCA 
34381 AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG CGTGGTCAAA CTCTACAGCC 
34441 AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG CTTCCAAAAG GCAAACGGCC 
34501 CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT GAATCTCCTC TATAAACATT 
34561 CCAGCACCTT CAACCATGCC CAAATAATTC TCATCTCGCC ACCTTCTCAA TATATCTCTA 
34621 AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT GCTCCAGAGC GCCCTCCACC 
34681 TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG TTCCTCACAG ACCTGTATAA 
34741 GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG TAGGTCCCTT CGCAGGGCCA 
34801 GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC CACTTCCCCG CCAGGAACCT 
34861 TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG AGCTATGCTA ACCAGCGTAG
34921 CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT GCAAGGTGCT GCTCAAAAAA 
34981 TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT CATGCTCATG CAGATAAAGG 
35041 CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT TTCTCTCAAA CATGTCTGCG 
35101 GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT TTAAACATTA GAAGCCTGTC 
35161 TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC TACGGCCATG CCGGCGTGAC 
35221 CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA CAGCTCCTCG GTCATGTCCG 
35281 GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT CATCGGTCAG TGCTAAAAAG
35341 CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA GAGACAACAT TACAGCCCCC 
35401 ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT AAACACCTGA AAAACCCTCC 
35461 TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT ACAGCGCTTC ACAGCGGCAG 
35521 CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA AAAAAACACC ACTCGACACG
35581 GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT GCAGAGCGAG TATATATAGG 
35641 ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC CCAGAAAACC GCACGCGAAC 
35701 CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT CAAATCGTCA CTTCCGTTTT 
35761 CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT CCCAACACAT ACAAGTTACT 
35821 CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC CGCGCCACGT CACAAACTCC 
35881 ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT ATATTATTGA TGATG 
Western blot on whole-cell extracts from 293 cells transfected with plasmid DNA expressing the
different HCV NS cassettes. Mature NS3 and NS5A products were detected with specific antibodies.
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 25 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on splenocytes from BalbC mice immunized with two injections of 50 µg DNA/dose with
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
Western blot on whole-cell extracts from HeLa cells infected at different multiplicities of infection
(m.o.i.; indicated at the top) with Adenovectors expressing the different HCV NS cassettes. Mature
NS5B and NS5A products were detected with specific antibodies.
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 10^9 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC.
IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^10 vp/dose
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive
IFNγ/CD3/CD8 per 10^6 lymphocytes.
IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^11 vp/dose
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive
IFNγ/CD3/CD8 per 10^6 lymphocytes.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of Ad5-NS.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of Ad5-NS.



FIG. 18B 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd5-NSmut.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd5-NSmut.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd6-NSmut.
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of MRKAd6-NSmut.



FIG. 18F



WO 03/031588 



PCT/US02/32512 




FIG. 19 






1 GCCACCATGG CCCCCATCAC CGCCTACAGC CAGCAGACCA GGGGCCTGCT
51 GGGCTGCATC ATCACCAGCC TGACCGGACG CGACAAGAAC CAGGTGGAGG
101 GAGAGGTGCA GGTGGTGAGC ACCGCTACCC AGAGCTTCCT GGCCACCTGC
151 GTGAACGGCG TGTGCTGGAC CGTGTACCAC GGAGCCGGAA GCAAGACCCT
201 GGCCGGACCC AAGGGCCCTA TCACCCAGAT GTACACCAAT GTGGATCAGG
251 ATCTGGTGGG CTGGCAGGCC CCTCCCGGAG CCAGGAGCCT GACACCCTGT
301 ACCTGTGGAA GCAGCGACCT GTACCTGGTG ACACGCCACG CCGATGTGAT
351 CCCCGTGAGG CGCAGGGGCG ATTCTCGCGG AAGCCTGCTG AGCCCTAGGC
401 CCGTGAGCTA CCTGAAGGGC AGCAGCGGAG GACCCCTGCT GTGTCCTTCT
451 GGCCATGCCG TGGGCATTTT TCGCGCTGCC GTGTGTACCA GGGGCGTGGC
501 CAAAGCCGTG GATTTTGTGC CCGTGGAAAG CATGGAGACC ACCATGCGCA
551 GCCCTGTGTT CACCGACAAC AGCTCTCCCC CTGCCGTGCC CCAATCATTC
601 CAGGTGGCTC ACCTGCACGC CCCTACCGGA TCTGGCAAGA GCACCAAGGT
651 GCCCGCTGCC TACGCCGCTC AGGGCTACAA GGTGCTGGTG CTGAACCCCA
701 GCGTGGCCGC TACCCTGGGC TTCGGCGCTT ACATGAGCAA GGCCCATGGC
751 ATCGACCCCA ACATCCGCAC AGGCGTGCGC ACCATCACCA CCGGAGCTCC
801 CGTGACCTAC AGCACCTACG GCAAGTTCCT GGCCGATGGA GGCTGCAGCG
851 GAGGAGCCTA CGACATCATC ATCTGCGACG AGTGCCACAG CACCGACAGC
901 ACCACCATCC TGGGCATTGG CACCGTGCTG GATCAGGCCG AAACAGCTGG
951 AGCCAGGCTG GTGGTGCTGG CCACAGCTAC CCCTCCTGGC AGCGTGACCG
1001 TGCCCCATCC CAATATCGAG GAGGTGGCCC TGAGCAACAC AGGCGAGATC
1051 CCCTTCTACG GCAAGGCCAT CCCCATCGAG GCCATCCGCG GAGGCAGGCA
1101 CCTGATCTTC TGCCACAGCA AGAAGAAGTG CGACGAGCTG GCTGCCAAGC
1151 TGAGCGGACT GGGCATCAAC GCCGTGGCCT ACTACAGGGG CCTGGACGTG
1201 TCAGTGATCC CCACCATCGG CGATGTGGTG GTGGTGGCCA CCGACGCCCT
1251 GATGACAGGC TACACCGGAG ACTTCGACAG CGTGATCGAC TGCAACACCT
1301 GCGTGACCCA GACCGTGGAC TTCAGCCTGG ACCCCACCTT CACCATCGAA
1351 ACCACCACCG TGCCTCAGGA TGCTGTGAGC AGGAGCCAGA GGCGCGGAGG
1401 CACCGGAAGG GGCAGGCGCG GAATTTATCG CTTTGTGACC CCTGGCGAAA
1451 GGCCCTCTGG CATGTTCGAC AGCAGCGTGC TGTGCGAGTG CTACGAGGCT
1501 GGCTGCGGTT GGTACGAGCT GACACCCGCT GAAACCAGCG TGCGCCTGCG
1551 CGCTTATCTG AATACCCCTG GCCTGCCCGT GTGTCAGGAC CAGCTGGAGT
1601 TCTGGGAGAG CGTGTTCACA GGACTGACCC ACATCGACGC CCATTTCCTG
1651 AGCCAGACCA AGCAGGCTGG CGACAACTTC CCCTATCTGG TGGCCTATCA
1701 GGCCACCGTG TGTGCTAGGG CCCAAGCTCC ACCTCCTTCA TGGGACCAGA
1751 TGTGGAAGTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCTACCCCT
1801 CTGCTGTACC GCCTGGGAGC CGTGCAGAAC GAGGTGACCC TGACCCACCC
1851 CATCACCAAG TACATCATGG CGTGCATGAG CGCTGATCTG GAAGTGGTGA
1901 CCAGCACCTG GGTGCTGGTG GGAGGCGTGC TGGCCGCTCT GGCTGCCTAC
1951 TGCCTGACCA CCGGAAGCGT GGTGATCGTG GGACGCATCA TCCTGAGCGG
2001 AAGGCCCGCT ATCGTGCCCG ATCGCGAGTT CCTGTACCAG GAGTTCGACG
2051 AGATGGAGGA GTGTGCCAGC CACCTGCCCT ACATCGAGCA GGGCATGCAG
2101 CTGGCCGAAC AGTTCAAGCA GAAGGCCCTG GGCCTGCTGC AGACAGCCAC
2151 CAAACAGGCC GAAGCTGCCG CTCCCGTGGT GGAAAGCAAG TGGAGGGCCC
2201 TGGAGACCTT CTGGGCTAAG CACATGTGGA ACTTCATCTC TGGCATCCAG
2251 TACCTGGCCG GACTGAGCAC CCTGCCTGGC AACCCCGCTA TCGCCAGCCT
2301 GATGGCCTTC ACCGCTAGCA TCACCTCTCC CCTGACCACC CAGAGCACCC
2351 TGCTGTTCAA CATTCTGGGC GGATGGGTGG CCGCTCAGCT GGCCCCTCCT
2401 TCAGCTGCTT CTGCCTTTGT GGGCGCTGGC ATTGCCGGAG CCGCTGTGGG
2451 CAGCATTGGC CTGGGCAAAG TGCTGGTGGA TATTCTGGCT GGCTATGGCG
2501 CTGGCGTGGC CGGAGCCCTG GTGGCCTTCA AGGTGATGAG CGGAGAGATG
2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCTGCCATTC TGAGCCCTGG
2601 AGCCCTGGTG GTGGGCGTGG TGTGTGCTGC CATTCTGAGG CGCCATGTGG
2651 GACCCGGAGA GGGCGCTGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC
2701 TCTCGCGGAA ACCACGTGAG CCCTACCCAC TACGTGCCTG AGAGCGACGC
2751 CGCTGCCAGG GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC
2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC ACCCTGCAGC
2851 GGAAGCTGGC TGAGGGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA
2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAACTG CCTGGCGTGC
2951 CCTTCTTCTC ATGCCAGCGC GGATACAAGG GCGTGTGGAG GGGCGATGGC
3001 ATCATGCAGA CCACCTGTCC CTGCGGAGCC CAGATCACAG GCCACGTGAA
3051 GAACGGCAGC ATGCGCATCG TGGGCCCTAA GACCTGCAGC AACACCTGGC
3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGACCCTG CACACCCAGC
3151 CCTGCTCCCA ACTACAGCAG GGCCCTGTGG AGGGTGGCTG CCGAGGAGTA
3201 CGTGGAGGTG ACCAGGGTGG GAGACTTCCA CTACGTGACC GGAATGACCA
3251 CCGACAACGT GAAGTGTCCC TGTCAGGTGC CCGCTCCCGA ATTTTTTACC
3301 GAAGTGGATG GCGTGCGCCT GCATCGCTAT GCCCCTGCCT GTAGGCCCCT
3351 GCTGCGCGAA GAAGTGACCT TCCAGGTGGG CCTGAACCAG TACCTGGTGG
3401 GCAGCCAGCT GCCCTGCGAG CCTGAGCGCG ATGTGGCCGT GCTGACCAGC
3451 ATGCTGACCG ACCCCAGCCA CATCAGAGCC GAAACCGCTA AAAGGCGCCT
3501 GGCCAGGGGC TCTCCTCCAA GCCTGGCCTC AAGCAGCGCT AGGCAGCTGT
3551 CTGCTCCCAG CCTGAAGGCC ACCTGCAGCA CCCACCACGT GAGCCCCGAC
3601 GCGGACCTGA TCGAGGCCAA CCTGCTGTGG CGCCAGGAGA TGGGCGGCAA
3651 CATCACGCGC GTGGAGAGCG AGAACAAGGT GGTGGTGGTG GACAGCTTCG
3701 ACCCCCTGCG CGCCGAGGAG GACGAGCGCG AGGTGAGCGT GCCCGCCGAG
3751 ATCCTGCGCA AGAGCAAGAA GTTCCCCGCT GCCATGCCCA TCTGGGCTAG
3801 ACCTGATTAG AACCCTCCCG TGCTGGAGAG CTGGAAGGAC GCTGATTACG
3851 TGCCTCCAGT GGTGCATGGG TGTCCTCTGC CTCCCATTAA AGCCCCTCCT
3901 ATTCCACCTC CTAGGCGCAA AAGGACCGTG GTGCTGACAG AAAGCAGCGT
3951 GAGCTCTGCT CTGGCCGAAC TGGCCACCAA GACCTTTGGC AGCAGCGAGA
4001 GCTCTGCCGT GGACAGCGGA AGAGCCACCG CTCTGCCTGA CCAGGCCAGC
4051 GACGACGGCG ATAAGGGCAG CGATGTGGAG AGCTATAGCA GCATGCCTCC
4101 CCTGGAAGGC GAACCTGGCG ATCCCGATCT GAGCGATGGC AGCTGGAGCA
4151 CCGTGAGCGA AGAGGCCAGC GAGGACGTGG TGTGTTGCAG CATGAGCTAC
4201 ACCTGGACAG GCGCTCTGAT CACACCCTGC GCTGCCGAGG AGAGCAAGCT
4251 GCCCATCAAC GCCCTGAGCA ACAGCCTGCT GAGGGACCAC AACATGGTGT
4301 ACGCCACCAC CAGCAGGTCT GCCGGACTGA GGCAGAAGAA GGTGACCTTC
4351 GACCGCCTGC AGGTGCTGGA CGACCACTAC CGCGATGTGC TGAAGGAGAT
4401 GAAGGCCAAG GCCAGCACCG TGAAGGCCAA GCTGCTGAGC GTGGAGGAGG
4451 CCTGCAAGCT GACCCCCCCC CACAGCGCCA AGAGCAAGTT CGGCTACGGC
4501 GCCAAGGACG TGCGCAACCT GAGCAGCAAG GCCGTGAACC ACATCCACAG
4551 CGTGTGGAAG GACCTGCTGG AGGACACCGT GACCCCCATC GACACCACCA
4601 TCATGGCCAA GAACGAGGTG TTCTGCGTGC AGCCCGAGAA GGGCGGCCGG
4651 AAGCCCGCTC GCCTGATCGT GTTCCCCGAT CTGGGCGTGC GCGTGTGCGA
4701 GAAGATGGCC CTGTACGACG TGGTGAGCAC CCTGCCTCAG GTGGTGATGG
4751 GCTCAAGCTA CGGCTTCCAG TACAGCCCTG GCCAGCGCGT GGAGTTCCTG
4801 GTGAACACCT GGAAGAGCAA GAAGAACCCC ATGGGCTTCA GCTACGACAC
4851 ACGCTGCTTC GACAGCACCG TGACCGAGAA CGACATCCGC GTGGAGGAGA
4901 GCATCTACCA GTGCTGCGAC CTGGCCCCTG AGGCCAGGCA GGCCATCAAG
4951 AGCCTGACCG AGCGCCTGTA CATCGGAGGC CCTCTGACCA ACAGCAAGGG
5001 ACAGAACTGC GGATACAGGC GCTGTAGGGC CTCTGGCGTG CTGACCACCA
5051 GCTGTGGCAA CACCCTGACC TGCTACCTGA AGGCCAGCGC TGCCTGTCGC
5101 GCTGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CTGGCCTGGT
5151 GGTGATTTGT GAAAGCGCTG GCACCCAGGA AGATGCTGCC AGCCTGCGCG
5201 TGTTCACCGA GGCCATGACC AGGTACTCTG CCCCTCCCGG AGACCCCCCT
5251 CAGCCCGAAT ACGACCTGGA GCTGATCACC AGCTGCTCAA GCAACGTGAG
5301 CGTGGCTCAC GACGCCAGCG GAAAGCGCGT GTACTACCTG ACACGCGATC
5351 CCACCACCCC TCTGGCTCGC GCTGCCTGGG AAACCGCTCG CCATACACCC
5401 GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCTA CCCTGTGGGC
5451 TCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCTCAGGAGC
5501 AGCTGGAGAA GGCCCTGGAC TGCCAGATTT ACGGCGCTTG CTACAGCATC
5551 GAGCCCCTGG ACCTGCCCCA AATCATCGAG CGCCTGCACG GCCTGTCTGC
5601 CTTCAGCCTG CACAGCTACA GCCCTGGCGA AATTAATCGC GTGGCCAGCT
5651 GTCTGCGCAA ACTGGGCGTG CCTCCTCTGC GCGTGTGGAG GCATAGGGCT
5701 AGGAGCGTGA GGGCTAGGCT GCTGAGCCAG GGAGGCAGGG CCGCTACCTG
5751 TGGAAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC
5801 CTATCCCTGC CGCTAGCCAG CTGGACCTGA GCGGATGGTT CGTGGGTGGC
5851 TACAGCGGAG GCGACATCTA CCACAGCCTG TCTCGCGCTC GCCCTCGCTG
5901 GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC
5951 TGCCCAACCG CTAAA
<110> Merck & Co. Inc., and Istituto Di Ricerche Di Biologia Molecolare P.
Angeletti S.P.A.

<120> HEPATITIS C VIRUS VACCINE 

<130> ITR0015Y 

<150> 60/363,774 
<151> 2002-03-13 

<150> 60/328,655 
<151> 2001-10-11 

<160> 17 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 1985 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide 



<400> 1 



Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly

20 25 30

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys

35 40 45

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr

50 55 60

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr

85 90 95

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala

100 105 110

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu

115 120 125

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu

130 135 140

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met

165 170 175

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro

180 185 190

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly

195 200 205
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Ser 


Glv 


Lys 


Ser 




210 






Lys 


Val 


Leu 


Val 


225 








Ala 


Tyr 


Met 


Ser 


Val 




Thr 


He 








260 




Phe 


Leu 


Ala 






275 




lie 


Cys 


Asp 


Glu 




290 








Thr 


Val 


Leu 


305 








Leu 


Ala 


Thr 


Ala 


lie 


Glu 


Glu 


Val 








340 


Lys 


Ala 


He 


Pro 






355 




Cys 


His 


Ser 


Lys 




370 






Leu 


Glv 


He 


Ash. 


385 








He 


Pro 


Thr 


He 


Thr 


Gly 


Tyr 


Thr 








420 


Val 


Thr 


Gin 


Thr 






435 




Thr 


Thr 


Thr 


Val 




450 






Arcr 


Thr 


Glv 


Arcr 


465 








Glu 


Arg 


Pro 


Ser 


Asp 


Ala 


Gly 


Cys 








500 


Arg 


Leu 


Arg 


Ala 






515 




His 


Leu 


Glu 


Phe 




530 






Ala 


His 


Phe 


Leu 


545 








Leu 


Val 


Ala 


Tyr 


Pro 


Ser 


Trp 


Asp 








580 


Leu 


His 


Gly 


Pro 






595 




Glu 


Val 


Thr 


Leu 




610 






Ser 


Ala 


Asp 


Leu 



Thr 


Lys 


Val 


Pro 






215 




Leu 


Asn 


Pro 


Ser 




230 






Lys 


Ala 


His 


Gly 


245 








Thr 


Thr Gly 


Ala 


Asd 


Gly Gly 


Cys 








280 


Cys 


His 


Ser 


Thr 






295 




Asp 


Gin 


Ala 


Glu 




310 






Thr 


Pro 


Pro 


Gly 


325 








Ala 


Leu 


Ser 


Asn 


He 


Glu 


Ala 


He 








360 


Lys 


Lys Cys 


Asp 






375 




Ala 


Val 


Ala 


Tyr 




390 






Gly 


Asp Val 


Val 


405 








Gly 


Asp 


Phe 


Asp 


Val 


Asp 


Phe 


Ser 








440 


Pro 


Glh 


Asp 


Ala 






455 




Gly 


Arg Arg 


Gly 




470 






Gly 


Met 


Phe 


Asp 


485 








Ala 


Trp 


Tyr 


Glu 


Tyr 


Leu 


Asn 


Thr 








520 


Trp 


Glu 


Ser 


Val 






535 




Ser 


Gin 


Thr 


Lys 




550 






Gin 


Ala 


Thr 


Val 


565 








Gin 


Met 


Trp 


Lys 


Thr 


Pro 


Leu 


Leu 








600 


Thr 


His 


Pro 


He 






615 




Glu 


Val 


Val 


Thr 




630 







Ala 


Ala 


Tyr 


Ala 








220 


Val 


Ala 


Ala 


Thr 






235 




He Asp 


Pro 


Asn 




250 






Pro 


Val 


Thr 


Tyr 


265 








Ser Gly 


Gly Ala 


Asp 


Ser 


Thr 


Thr 








300 


Thr 


Ala 


Gly Ala 






315 




Ser Val 


Thr 


Val 




330 






Thr Gly 


Glu 


He 


345 








Arg Gly 


Gly Arg 


Glu 


Leu 


Ala 


Ala 








380 


Tyr Arg 


Gly Leu 






395 




Val 


Val 


Ala 


Thr 




410 






Ser 


Val 


He 


Asp 


425 








Leu 


Asp 


Pro 


Thr 


Val 


Ser 


Arg 


Ser 








460 


He 


Tyr 


Arg 


Phe 






475 




Ser 


Ser 


Val 


Leu 




490 






Leu 


Thr 


Pro 


Ala 


505 








Pro Gly 


Leu 


Pro 


Phe 


Thr 


Gly 


Leu 








540 


Gin 


Ala 


Gly Asp 






555 




Cys 


Ala 


Arg Ala 




570 






Cys 


Leu 


He 


Arg 


585 








Tyr 


Arg 


Leu 


Gly 


Thr 


Lys 


Tyr 


He 








620 


Ser 


Thr 


Trp Val 






635 





Ala Gin Gly Tyr 

Leu Gly Phe Gly 
240 

lie Arg Thr Gly 
255 

Ser Thr Tyr Gly 
270 

Tyr Asp He He 
285 

He Leu Gly He 

Arg Leu Val Val 
320 

Pro His Pro Asn 
335 

Pro Phe Tyr Gly 
350 

His Leu He Phe 
365 

Lys Leu Ser Gly 

Asp Val Ser Val 
400 

Asp Ala Leu Met 
415 

Cys Asn Thr Cys 
430 

Phe Thr lie Glu 
445 

Gin Arg Arg Gly 

Val Thr Pro Gly 
480 

Cys Glu Cys Tyr 
495 

Glu Thr Ser Val 
.510 

Val Cys Gin Asp 
525 

Thr His He Asp 

Asn Phe Pro Tyr 
560 

Gin Ala Pro Pro 
575 

Leu Lys Pro Thr 
590 

Ala Val Glri Ash 
605 

Met Ala Cys Met 

Leu Val Gly Gly 
640 
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val

645 650 655

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp

660 665 670

Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser

675 680 685

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys

690 695 700

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp

725 730 735

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly

740 745 750

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe

755 760 765

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe

770 775 780

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser

805 810 815

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala

820 825 830

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met

835 840 845

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro

850 855 860

Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala

885 890 895

Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu

900 905 910

Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile

915 920 925

Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser

930 935 940

Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro

965 970 975

Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly

980 985 990

Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala

995 1000 1005

Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro

1010 1015 1020

Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala

1045 1050 1055

Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070
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Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro 

1075 1080 1085 

Cys Gin Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg 

1090 1095 1100 

Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val 
1105 1110 1115 1120 

Thr Phe Gin Val Gly Leu Asn Gin Tyr Leu Val Gly Ser Gin Leii Pro 

1125 1130 1135 

Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp 

1140 1145 1150 

Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly

1155 1160 1165 

Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro

1170 1175 1180 

Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp 
1185 1190 1195 1200 

Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile

1205 1210 1215 

Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp 

1220 1225 1230 

Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu 

1235 1240 1245 

Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala

1250 1255 1260 

Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp 
1265 1270 1275 1280 

Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala

1285 1290 1295 

Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu

1300 1305 1310 

Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly 

1315 1320 1325 

Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro 

1330 1335 1340 

Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360 

Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 

1365 1370 1375 

Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val 

1380 1385 1390 

Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys

1395 1400 1405 

Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu

1410 1415 1420 

Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly 
1425 1430 1435 1440 

Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp

1445 1450 1455 

His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val 

1460 1465 1470 

Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 

1475 1480 1485 

His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500 
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520 

Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn

1525 1530 1535 

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg

1540 1545 1550 

Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala

1555 1560 1565 

Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser

1570 1575 1580 

Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600 

Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg 

1605 1610 1615 

Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser

1620 1625 1630 

Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys

1635 1640 1645 

Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys

1650 1655 1660 

Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680 

Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala

1685 1690 1695 

Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Ala Ala

1700 1705 1710 

Gly Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala

1715 1720 1725 

Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro 

1730 1735 1740 

Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760 

Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr 

1765 1770 1775 

Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu 

1780 1785 1790 

Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met

1795 1800 1805 

Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe

1810 1815 1820 

Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840 

Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile

1845 1850 1855 

Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser

1860 1865 1870

Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val

1875 1880 1885 

Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg 

1890 1895 1900 

Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920 

Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935 
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly

1940 1945 1950 

Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu

1955 1960 1965

Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980 

Arg 
1985 

<210> 2
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Non-optimized cDNA sequence encoding SEQ. ID. NO. 1

<400> 2 

gccaccatgg cgcccatcac ggcctactcc caacagacgc ggggcctact tggttgcatc 60 

atcactagcc ttacaggccg ggacaagaac caggtcgagg gagaggttca ggtggtttcc 120 

accgcaacac aatccttcct ggcgacctgc gtcaacggcg tgtgttggac cgtttaccat 180 

ggtgctggct caaagacctt agccggccca aaggggccaa tcacccagat gtacactaat 240 

gtggaccagg acctcgtcgg ctggcaggcg ccccccgggg cgcgttcctt gacaccatgc 300 

acctgtggca gctcagacct ttacttggtc acgagacatg ctgacgtcat tccggtgcgc 360 

cggcggggcg acagtagggg gagcctgctc tcccccaggc ctgtctccta cttgaagggc 420 

tcttcgggtg gtccactgct ctgcccttcg gggcacgctg tgggcatctt ccgggctgcc 480 

gtatgcaccc ggggggttgc gaaggcggtg gactttgtgc ccgtagagtc catggaaact 540 

actatgcggt ctccggtctt cacggacaac tcatcccccc cggccgtacc gcagtcattt 600 

caagtggccc acctacacgc tcccactggc agcggcaaga gtactaaagt gccggctgca 660 

tatgcagccc aagggtacaa ggtgctcgtc ctcaatccgt ccgttgccgc taccttaggg 720 

tttggggcgt atatgtctaa ggcacacggt attgacccca acatcagaac tggggtaagg 780 

accattacca caggcgcccc cgtcacatac tctacctatg gcaagtttct tgccgatggt 840 

ggttgctctg ggggcgctta tgacatcata atatgtgatg agtgccattc aactgactcg 900 

actacaatct tgggcatcgg cacagtcctg gaccaagcgg agacggctgg agcgcggctt 960 

gtcgtgctcg ccaccgctac gcctccggga tcggtcaccg tgccacaccc aaacatcgag 1020 

gaggtggccc tgtctaatac tggagagatc cccttctatg gcaaagccat ccccattgaa 1080 

gccatcaggg ggggaaggca tctcattttc tgtcattcca agaagaagtg cgacgagctc 1140 

gccgcaaagc tgtcaggcct cggaatcaac gctgtggcgt attaccgggg gctcgatgtg 1200 

tccgtcatac caactatcgg agacgtcgtt gtcgtggcaa cagacgctct gatgacgggc 1260

tatacgggcg actttgactc agtgatcgac tgtaacacat gtgtcaccca gacagtcgac 1320 

ttcagcttgg atcccacctt caccattgag acgacgaccg tgcctcaaga cgcagtgtcg 1380 

cgctcgcagc ggcggggtag gactggcagg ggtaggagag gcatctacag gtttgtgact 1440 

ccgggagaac ggccctcggg catgttcgat tcctcggtcc tgtgtgagtg ctatgacgcg 1500 

ggctgtgctt ggtacgagct cacccccgcc gagacctcgg ttaggttgcg ggcctacctg 1560 

aacacaccag ggttgcccgt ttgccaggac cacctggagt tctgggagag tgtcttcaca 1620 

ggcctcaccc acatagatgc acacttcttg tcccagacca agcaggcagg agacaacttc 1680 

ccctacctgg tagcatacca agccacggtg tgcgccaggg ctcaggcccc acctccatca 1740 

tgggatcaaa tgtggaagtg tctcatacgg ctgaaaccta cgctgcacgg gccaacaccc 1800 

ttgctgtaca ggctgggagc cgtccaaaat gaggtcaccc tcacccaccc cataaccaaa 1860 

tacatcatgg catgcatgtc ggctgacctg gaggtcgtca ctagcacctg ggtgctggtg 1920 

ggcggagtcc ttgcagctct ggccgcgtat tgcctgacaa caggcagtgt ggtcattgtg 1980 

ggtaggatta tcttgtccgg gaggccggct attgttcccg acagggagtt tctctaccag 2040 

gagttcgatg aaatggaaga gtgcgcctcg cacctccctt acatcgagca gggaatgcag 2100 

ctcgccgagc aattcaagca gaaagcgctc gggttactgc aaacagccac caaacaagcg 2160 
gaggctgctg ctcccgtggt ggagtccaag tggcgagccc ttgagacatt ctgggcgaag 2220

cacatgtgga atttcatcag cgggatacag tacttagcag gcttatccac tctgcctggg 2280 

aaccccgcaa tagcatcatt gatggcattc acagcctcta tcaccagccc gctcaccacc 2340 

caaagtaccc tcctgtttaa catcttgggg gggtgggtgg ctgcccaact cgcccccccc 2400 

agcgccgctt cggctttcgt gggcgccggc atcgccggtg cggctgttgg cagcataggc 2460 

cttgggaagg tgcttgtgga cattctggcg ggttatggag caggagtggc cggcgcgctc 2520 

gtggccttca aggtcatgag cggcgagatg ccctccaccg aggacctggt caatctactt 2580 

cctgccatcc tctctcctgg cgccctggtc gtcggggtcg tgtgtgcagc aatactgcgt 2640 

cgacacgtgg gtccgggaga gggggctgtg cagtggatga accggctgat agcgttcgcc 2700

tcgcggggta atcatgtttc ccccacgcac tatgtgcctg agagcgacgc cgcagcgcgt 2760 

gttactcaga tcctctccag ccttaccatc actcagctgc tgaaaaggct ccaccagtgg 2820 

attaatgaag actgctccac accgtgttcc ggctcgtggc taagggatgt ttgggactgg 2880 

atatgcacgg tgttgactga cttcaagacc tggctccagt ccaagctcct gccgcagcta 2940 

ccgggagtcc cttttttctc gtgccaacgc gggtacaagg gagtctggcg gggagacggc 3000 

atcatgcaaa ccacctgccc atgtggagca cagatcaccg gacatgtcaa aaacggttcc 3060 

atgaggatcg tcgggcctaa gacctgcagc aacacgtggc atggaacatt ccccatcaac 3120 

gcatacacca cgggcccctg cacaccctct ccagcgccaa actattctag ggcgctgtgg 3180 

cgggtggccg ctgaggagta cgtggaggtc acgcgggtgg gggatttcca ctacgtgacg 3240 

ggcatgacca ctgacaacgt aaagtgccca tgccaggttc cggctcctga attcttcacg 3300 

gaggtggacg gagtgcggtt gcacaggtac gctccggcgt gcaggcctct cctacgggag 3360 

gaggttacat tccaggtcgg gctcaaccaa tacctggttg ggtcacagct accatgcgag 3420 

cccgaaccgg atgtagcagt gctcacttcc atgctcaccg acccctccca catcacagca 3480 

gaaacggcta agcgtaggtt ggccaggggg tctcccccct ccttggccag ctcttcagct 3540 

agccagttgt ctgcgccttc cttgaaggcg acatgcacta cccaccatgt ctctccggac 3600 

gctgacctca tcgaggccaa cctcctgtgg cggcaggaga tgggcgggaa catcacccgc 3660 

gtggagtcgg agaacaaggt ggtagtcctg gactctttcg acccgcttcg agcggaggag 3720 

gatgagaggg aagtatccgt tccggcggag atcctgcgga aatccaagaa gttccccgca 3780 

gcgatgccca tctgggcgcg cccggattac aaccctccac tgttagagtc ctggaaggac 3840 

ccggactacg tccctccggt ggtgcacggg tgcccgttgc cacctatcaa ggcccctcca 3900 

ataccacctc cacggagaaa gaggacggtt gtcctaacag agtcctccgt gtcttctgcc 3960 

ttagcggagc tcgctactaa gaccttcggc agctccgaat catcggccgt cgacagcggc 4020 

acggcgaccg cccttcctga ccaggcctcc gacgacggtg acaaaggatc cgacgttgag 4080 

tcgtactcct ccatgccccc ccttgagggg gaaccggggg accccgatct cagtgacggg 4140 

tcttggtcta ccgtgagcga ggaagctagt gaggatgtcg tctgctgctc aatgtcctac 4200 

acatggacag gcgccttgat cacgccatgc gctgcggagg aaagcaagct gcccatcaac 4260

gcgttgagca actctttgct gcgccaccat aacatggttt atgccacaac atctcgcagc 4320 

gcaggcctgc ggcagaagaa ggtcaccttt gacagactgc aagtcctgga cgaccactac 4380 

cgggacgtgc tcaaggagat gaaggcgaag gcgtccacag ttaaggctaa actcctatcc 4440 

gtagaggaag cctgcaagct gacgccccca cattcggcca aatccaagtt tggctatggg 4500 

gcaaaggacg tccggaacct atccagcaag gccgttaacc acatccactc cgtgtggaag 4560 

gacttgctgg aagacactgt gacaccaatt gacaccacca tcatggcaaa aaatgaggtt 4620 

ttctgtgtcc aaccagagaa aggaggccgt aagccagccc gccttatcgt attcccagat 4680 

ctgggagtcc gtgtatgcga gaagatggcc ctctatgatg tggtctccac ccttcctcag 4740 

gtcgtgatgg gctcctcata cggattccag tactctcctg ggcagcgagt cgagttcctg 4800 

gtgaatacct ggaaatcaaa gaaaaacccc atgggctttt catatgacac tcgctgtttc 4860 

gactcaacgg tcaccgagaa cgacatccgt gttgaggagt caatttacca atgttgtgac 4920 

ttggcccccg aagccagaca ggccataaaa tcgctcacag agcggcttta tatcgggggt 4980 

cctctgacta attcaaaagg gcagaactgc ggttatcgcc ggtgccgcgc gagcggcgtg 5040 

ctgacgacta gctgcggtaa caccctcaca tgttacttga aggcctctgc agcctgtcga 5100 

gctgcgaagc tccaggactg cacgatgctc gtgaacgccg ccggccttgt cgttatctgt 5160 

gaaagcgcgg gaacccaaga ggacgcggcg agcctacgag tcttcacgga ggctatgact 5220 

aggtactctg ccccccccgg ggacccgccc caaccagaat acgacttgga gctgataaca 5280 

tcatgttcct ccaatgtgtc ggtcgcccac gatgcatcag gcaaaagggt gtactacctc 5340 

acccgtgatc ccaccacccc cctcgcacgg gctgcgtggg aaacagctag acacactcca 5400 

gttaactcct ggctaggcaa cattatcatg tatgcgccca ctttgtgggc aaggatgatt 5460 
ctgatgactc acttcttctc catccttcta gcacaggagc aacttgaaaa agccctggac 5520

tgccagatct acggggcctg ttactccatt gagccacttg acctacctca gatcattgaa 5580 

cgactccatg gccttagcgc attttcactc catagttact ctccaggtga gatcaatagg 5640 

gtggcttcat gcctcaggaa acttggggta ccacccttgc gagtctggag acatcgggcc 5700 

aggagcgtcc gcgctaggct actgtcccag ggggggaggg ccgccacttg tggcaagtac 5760

ctcttcaact gggcagtgaa gaccaaactc aaactcactc caatcccggc tgcgtcccag 5820 

ctggacttgt ccggctggtt cgttgctggt tacagcgggg gagacatata tcacagcctg 5880 

tctcgtgccc gaccccgctg gttcatgctg tgcctactcc tactttctgt aggggtaggc 5940 

atctacctgc tccccaaccg ataaa 5965 

<210> 3 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Optimized cDNA encoding SEQ ID NO: 1 
<400> 3 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560 

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 
gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgcc 3780 

gccatgccca tctgggcccg ccccgactac aacccccccc tgctggagag ctggaaggac 3840 

cccgactacg tgccccccgt ggtgcacggc tgccccctgc cccccatcaa ggcccccccc 3900 

atcccccccc cccgccgcaa gcgcaccgtg gtgctgaccg agagcagcgt gagcagcgcc 3960 

ctggccgagc tggccaccaa gaccttcggc agcagcgaga gcagcgccgt ggacagcggc 4020 

accgccaccg ccctgcccga ccaggccagc gacgacggcg acaagggcag cgacgtggag 4080 

agctacagca gcatgccccc cctggagggc gagcccggcg accccgacct gagcgacggc 4140 

agctggagca ccgtgagcga ggaggccagc gaggacgtgg tgtgctgcag catgagctac 4200

acctggaccg gcgccctgat caccccctgc gccgccgagg agagcaagct gcccatcaac 4260

gccctgagca acagcctgct gcgccaccac aacatggtgt acgccaccac cagccgcagc 4320 

gccggcctgc gccagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgacgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 

gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 
ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940

atctacctgc tgcccaaccg ctaaa 5965

<210> 4
<211> 37090
<212> DNA

<213> Artificial Sequence
<220>

<223> MRKAd6-NSmut nucleic acid



<400> 4 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420

cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 480

catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 540

gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 600

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 660

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 720

attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 780

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 840

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 900

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 960

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 1020

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 1080

gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 1140

cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 1200

tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc 1260

accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc 1320

actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc 1380

gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt 1440

gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 1500

gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 1560

tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 1620

cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 1680

tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 1740

tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 1800

atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 1860

gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 1920

gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 1980

ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 2040

attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 2100

tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 2160
acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 2220

gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 2280 

gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 2340

atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 2400 

gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 2460 

gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 2520 

acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 2580 

agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 2640 

tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 2700 

ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 2760

tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 2820 

acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 2880 

ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 2940 

tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 3000 

gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 3060 

ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 3120 

atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 3180 

ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 3240 

aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 3300 

ttcgatgaaa tggaagagtg cgcctcgcac ctcccttaca tcgagcaggg aatgcagctc 3360

gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag 3420 

gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac 3480 

atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac 3540 

cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa 3600 

agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 3660 

gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 3720 

gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 3780 

gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 3840 

gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 3900

cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 3960 

cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 4020 

actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 4080 

aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 4140 

tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 4200 

ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 4260 

atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 4320 

aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 4380 

tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 4440 

gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 4500 

atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 4560 

gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 4620 

gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 4680 

gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 4740 

acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 4800 

cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 4860 

gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 4920 

gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 4980 

gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 5040 

atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 5100 

gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 5160

ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 5220 

gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 5280 

gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 5340 

tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 5400 

tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 5460 
tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 5520

ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 5580 

ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 5640 

gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 5700 

gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 5760 

aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 5820 

ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 5880 

tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 5940 

ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 6000 

gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 6060 

aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 6120 

tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 6180 

gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 6240 

ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 6300 

acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 6360 

gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 6420 

agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 6480 

tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 6540 

tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 6600 

cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 6660 

aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg 6720 

atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 6780 

cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 6840 

ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 6900 

gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 6960 

agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 7020 

ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 7080 

gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 7140 

cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 7200 

tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 7260 

ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 7320 

ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 7380 

ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 7440 

ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 7500 

ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 7560 

gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 7620 

atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 7680 

gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 7740 

actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 7800 

tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 7860 

aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 7920 

cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 7980 

gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 8040 

tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 8100 

ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 8160 

atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 8220 

gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 8280 

ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 8340 

agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 8400 

gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 8460 

tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 8520 

gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 8580 

ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 8640 

aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 8700 

ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 8760 
ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 8820

gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 8880 

gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 8940 

ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 9000 

ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 9060 

aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 9120 

agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 9180 

cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 9240 

gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 9300 

tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 9360 

tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 9420 

catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 9480 

cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 9540 

ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 9600 

tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 9660 

tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 9720 

cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 9780 

actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 9840 

aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 9900 

cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 9960 

ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat 10020 

cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 10080 

ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 10140 

tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 10200 

gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 10260 

tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 10320 

gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 10380 

gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 10440 

gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 10500 

gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 10560 

cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 10620 

gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 10680 

cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 10740 

ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 10800 

cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 10860 

tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 10920 

ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 10980 

gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 11040 

acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 11100 

tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 11160 

actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 11220 

cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 11280 

actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 11340

gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 11400 

cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 11460 

ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 11520 

gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 11580 

gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 11640 

aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 11700 

tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 11760 

cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 11820

gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 11880 

ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 11940 

agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 12000 

gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 12060

agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca 12120

caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 12180

cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 12240

ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa 12300

catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 12360

gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 12420

tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 12480

gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 12540

ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag 12600

agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt 12660

gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 12720

gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 12780

gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 12840

ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 12900

ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 12960

gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg 13020

cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag 13080

gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa 13140

cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 13200

ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg 13260

gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc 13320

ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 13380

aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 13440

ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 13500

ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 13560

ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 13620

cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc 13680

acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt 13740

tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 13800

cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 13860

ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac 13920

cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc 13980

ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 14040

catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 14100

ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 14160

gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 14220

gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 14280

ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 14340

gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 14400

gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 14460

gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 14520

ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 14580

gactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 14640

gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 14700

ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 14760

tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 14820

gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 14880

gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 14940

gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 15000

ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 15060

agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 15120

atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 15180

ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 15240

acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 15300

gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg 15360

gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 15420

cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 15480

cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 15540

gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 15600

aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 15660

gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 15720

gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 15780

cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 15840

ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 15900

ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 15960

tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 16020

cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 16080

cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 16140

ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 16200

cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 16260

gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 16320

agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 16380

tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 16440

gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 16500

atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 16560

gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac 16620

cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 16680

aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 16740

cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 16800

ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 16860

ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 16920

gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 16980

ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 17040

cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 17100

caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 17160

accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 17220

agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 17280

gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 17340

tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 17400

gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 17460

ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 17520

gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 17580

gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 17640

gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 17700

agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 17760

ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 17820

ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 17880

ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 17940

gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 18000

caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 18060

tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 18120

ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 18180

ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 18240

gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 18300

cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 18360

ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 18420

ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 18480

aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 18540

ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 18600

gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 18660

gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 18720

cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 18780 

tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 18840 

aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 18900 

gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 18960 

cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 19020 

gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 19080 

agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 19140 

tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 19200 

cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 19260 

cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 19320 

ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 19380 

atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 19440 

gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 19500 

ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 19560 

ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 19620 

gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 19680 

tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 19740 

gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 19800 

ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 19860 

cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa 19920 

gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 19980 

caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga 20040 

ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 20100 

cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 20160 

tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 20220 

cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 20280 

tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 20340 

tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct 20400 

gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 20460 

ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 20520 

aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 20580 

agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 20640 

tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 20700 

acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac 20760 

gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 20820 

ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 20880 

ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 20940 

actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 21000 

gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 21060 

tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 21120 

caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 21180 

cagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 21240

ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 21300

ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 21360 

gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 21420 

tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 21480 

aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 21540 

cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 21600 

ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 21660

cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 21720 

ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 21780 

tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 21840 

cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 21900 

catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 21960 
ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 22020

catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 22080 

ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 22140 

ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 22200 

aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 22260 

agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 22320 

ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 22380 

agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 22440 

cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 22500 

cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc 22560 

gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 22620 

gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 22680 

tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 22740 

tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 22800 

ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg 22860 

acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 22920 

tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 22980 

accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 23040 

cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 23100 

cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca 23160

ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg 23220 

aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 23280 

aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 23340 

gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 23400 

ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 23460 

aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 23520 

tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 23580 

tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 23640

caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 23700 

taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 23760 

ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 23820 

acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 23880 

aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 23940 

agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 24000 

ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 24060 

caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 24120 

caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 24180 

aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 24240 

tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 24300 

accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 24360 

tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 24420 

acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 24480 

tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 24540 

ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 24600 

atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 24660 

ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 24720 

ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 24780 

ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 24840 

catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 24900 

ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 24960 

ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 25020 

ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 25080 

cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 25140 

tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 25200 

aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 25260 
atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 25320

tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 25380 

acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 25440 

gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 25500 

tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 25560 

atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 25620 

tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 25680 

gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 25740 

tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 25800 

ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 25860 

cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 25920 

aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 25980 

ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 26040 

tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 26100 

attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 26160 

ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 26220 

gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 26280 

cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 26340 

gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 26400 

tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 26460 

gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc 26520 

acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc 26580 

gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc 26640 

ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg 26700 

ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct 26760 

ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat 26820 

gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 26880 

gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 26940 

ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 27000 

ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 27060 

accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 27120 

cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 27180 

acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 27240 

cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 27300 

tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 27360 

ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 27420 

ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 27480 

ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 27540 

ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 27600

cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 27660 

ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 27720 

tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 27780 

acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 27840 

gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 27900 

ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 27960 

gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 28020 

gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 28080 

acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 28140 

ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 28200 

acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 28260 

aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg 28320 

gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 28380 

ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 28440 

ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 28500 

catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 28560 
ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 28620

gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 28680 

tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 28740 

ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 28800 

gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 28860 

cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 28920 

gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 28980 

cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 29040 

agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 29100 

ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 29160 

gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 29220 

cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 29280 

agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 29340 

ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 29400 

agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 29460 

ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 29520 

tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 29580 

caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 29640 

gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 29700 

cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc 29760 

gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa 29820 

gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 29880 

cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 29940 

aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 30000 

agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 30060 

aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 30120

gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 30180 

cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag 30240

gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 30300

gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 30360

tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt 30420 

gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga 30480 

tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag 30540 

caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 30600 

ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 30660 

gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 30720 

ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 30780 

tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 30840 

ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 30900 

tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 30960 

gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 31020 

ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 31080 

cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 31140 

agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 31200 

ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 31260 

catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 31320 

aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 31380 

ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 31440 

agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 31500 

tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 31560 

tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 31620 

gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 31680 

ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 31740 

gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 31800 

tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 31860 
ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 31920

cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 31980 

cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 32040 

gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 32100 

tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 32160 

attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 32220 

ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 32280 

aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 32340 

cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 32400 

gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 32460 

caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 32520 

ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 32580 

tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 32640 

acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 32700 

gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 32760 

aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 32820 

acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 32880 

cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 32940 

ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta 33000 

acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 33060 

ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat 33120

gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 33180 

gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 33240 

tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 33300

ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 33360

atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 33420 

tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 33480 

caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 33540 

actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 33600 

tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 33660 

tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 33720 

actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 33780 

cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 33840 

tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 33900 

tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 33960 

caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc 34020 

cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 34080 

attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 34140 

cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 34200 

ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 34260 

gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac 34320 

gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 34380 

agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 34440 

ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 34500 

ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 34560 

gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 34620

ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 34680 

acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 34740 

ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 34800 

gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 34860 

gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 34920 

acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 34980 

catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 35040 

tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 35100 

ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg 35160

agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc atgccaaatg gaacgccgga 35220

cgtagtcata tttcctgaag caaaaccagg tgcgggcgtg acaaacagat ctgcgtctcc 35280 

ggtctcgccg cttagatcgc tctgtgtagt agttgtagta tatccactct ctcaaagcat 35340 

ccaggcgccc cctggcttcg ggttctatgt aaactccttc atgcgccgct gccctgataa 35400 

catccaccac cgcagaataa gccacaccca gccaacctac acattcgttc tgcgagtcac 35460 

acacgggagg agcgggaaga gctggaagaa ccatgttttt ttttttattc caaaagatta 35520 

tccaaaacct caaaatgaag atctattaag tgaacgcgct cccctccggt ggcgtggtca 35580 

aactctacag ccaaagaaca gataatggca tttgtaagat gttgcacaat ggcttccaaa 35640 

aggcaaacgg ccctcacgtc caagtggacg taaaggctaa acccttcagg gtgaatctcc 35700 

tctataaaca ttccagcacc ttcaaccatg cccaaataat tctcatctcg ccaccttctc 35760 

aatatatctc taagcaaatc ccgaatatta agtccggcca ttgtaaaaat ctgctccaga 35820 

gcgccctcca ccttcagcct caagcagcga atcatgattg caaaaattca ggttcctcac 35880 

agacctgtat aagattcaaa agcggaacat taacaaaaat accgcgatcc cgtaggtccc 35940 

ttcgcagggc cagctgaaca taatcgtgca ggtctgcacg gaccagcgcg gccacttccc 36000 

cgccaggaac catgacaaaa gaacccacac tgattatgac acgcatactc ggagctatgc 36060 

taaccagcgt agccccgatg taagcttgtt gcatgggcgg cgatataaaa tgcaaggtgc 36120 

tgctcaaaaa atcaggcaaa gcctcgcgca aaaaagaaag cacatcgtag tcatgctcat 36180

gcagataaag gcaggtaagc tccggaacca ccacagaaaa agacaccatt tttctctcaa 36240 

acatgtctgc gggtttctgc ataaacacaa aataaaataa caaaaaaaca tttaaacatt 36300 

agaagcctgt cttacaacag gaaaaacaac ccttataagc ataagacgga ctacggccat 36360 

gccggcgtga ccgtaaaaaa actggtcacc gtgattaaaa agcaccaccg acagctcctc 36420 

ggtcatgtcc ggagtcataa tgtaagactc ggtaaacaca tcaggttgat tcacatcggt 36480 

cagtgctaaa aagcgaccga aatagcccgg gggaatacat acccgcaggc gtagagacaa 36540 

cattacagcc cccataggag gtataacaaa attaatagga gagaaaaaca cataaacacc 36600 

tgaaaaaccc tcctgcctag gcaaaatagc accctcccgc tccagaacaa catacagcgc 36660 

ttccacagcg gcagccataa cagtcagcct taccagtaaa aaagaaaacc tattaaaaaa 36720 

acaccactcg acacggcacc agctcaatca gtcacagtgt aaaaaagggc caagtgcaga 36780 

gcgagtatat ataggactaa aaaatgacgt aacggttaaa gtccacaaaa aacacccaga 36840 

aaaccgcacg cgaacctacg cccagaaacg aaagccaaaa aacccacaac ttcctcaaat 36900 

cgtcacttcc gttttcccac gttacgtcac ttcccatttt aagaaaacta caattcccaa 36960 

cacatacaag ttactccgcc ctaaaaccta cgtcacccgc cccgttccca cgccccgcgc 37020 

cacgtcacaa actccacccc ctcattatca tattggcttc aatccaaaat aaggtatatt 37080

attgatgatg 37090



<210> 5 
<211> 5955 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NS cDNA sequence 

<221> CDS

<222> (1)...(5955)



<400> 5 

atg gcg ccc atc acg gcc tac tcc caa cag acg cgg ggc cta ctt ggt 48
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1 5 10 15

tgc atc atc act agc ctt aca ggc cgg gac aag aac cag gtc gag gga 96
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
20 25 30

gag gtt cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc 144
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
35 40 45

gtc aac ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc 192
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
50 55 60

tta gcc ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac 240
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65 70 75 80

cag gac ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca 288
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
85 90 95

cca tgc acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct 336
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
100 105 110

gac gtc att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc 384
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
115 120 125

tcc ccc agg cct gtc tcc tac ttg aag ggc tct tcg ggt ggt cca ctg 432
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
130 135 140

ctc tgc cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc 480
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145 150 155 160

acc cgg ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg 528
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
165 170 175

gaa act act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg 576
Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
180 185 190

gcc gta ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc 624
Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
195 200 205

agc ggc aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac 672
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
210 215 220

aag gtg ctc gtc ctc aat ccg tcc gtt gcc gct acc tta ggg ttt ggg 720
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225 230 235 240

gcg tat atg tct aag gca cac ggt att gac ccc aac atc aga act ggg 768
Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
245 250 255

gta agg acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc 816
Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
260 265 270

aag ttt ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata 864
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

ata tgt gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc 912
Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

ggc aca gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg 960
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

ctc gcc acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac 1008
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

atc gag gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc 1056
Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

aaa gcc atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc 1104
Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

tgt cat tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc 1152
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

ctc gga atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc 1200
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

ata cca act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg 1248
Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

acg ggc tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt 1296
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

gtc acc cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag 1344
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

acg acg acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt 1392
Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

agg act ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga 1440
Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

gaa cgg ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat 1488
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

gac gcg ggc tgt gct tgg tac gag ctc acc ccc gcc gag acc tcg gtt 1536
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

agg ttg cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac 1584
Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

cac ctg gag ttc tgg gag agt gtc ttc aca ggc ctc acc cac ata gat 1632
His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

gca cac ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac 1680
Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

ctg gta gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct 1728
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

cca tca tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg 1776
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

ctg cac ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat 1824
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

gag gtc acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg 1872
Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

tcg gct gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga 1920
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640

gtc ctt gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc 1968
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
645 650 655

att gtg ggt agg att atc ttg tcc ggg agg ccg gct att gtt ccc gac 2016
Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
660 665 670

agg gag ttt ctc tac cag gag ttc gat gaa atg gaa gag tgc gcc tcg 2064
Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
675 680 685

cac ctc cct tac atc gag cag gga atg cag ctc gcc gag caa ttc aag 2112
His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
690 695 700

cag aaa gcg ctc ggg tta ctg caa aca gcc acc aaa caa gcg gag gct 2160
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

gct gct ccc gtg gtg gag tcc aag tgg cga gcc ctt gag aca ttc tgg 2208
Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
725 730 735

gcg aag cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc 2256
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
740 745 750

tta tcc act ctg cct ggg aac ccc gca ata gca tca ttg atg gca ttc 2304
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
755 760 765

aca gcc tct atc acc agc ccg ctc acc acc caa agt acc ctc ctg ttt 2352
Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
770 775 780

aac atc ttg ggg ggg tgg gtg gct gcc caa ctc gcc ccc ccc agc gcc 2400
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

gct tcg gct ttc gtg ggc gcc ggc atc gcc ggt gcg gct gtt ggc agc 2448
Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
805 810 815

ata ggc ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca 2496
Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
820 825 830

gga gtg gcc ggc gcg ctc gtg gcc ttc aag gtc atg agc ggc gag atg 2544
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
835 840 845

ccc tcc acc gag gac ctg gtc aat cta ctt cct gcc atc ctc tct cct 2592
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
850 855 860

ggc gcc ctg gtc gtc ggg gtc gtg tgt gca gca ata ctg cgt cga cac 2640
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

gtg ggt ccg gga gag ggg gct gtg cag tgg atg aac cgg ctg ata gcg 2688
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
885 890 895

ttc gcc tcg cgg ggt aat cat gtt tcc ccc acg cac tat gtg cct gag 2736
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
900 905 910

agc gac gcc gca gcg cgt gtt act cag atc ctc tcc agc ctt acc atc 2784
Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile
915 920 925



act cag ctg ctg aaa agg ctc cac cag tgg att aat gaa gac tgc tcc 2832
Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser
930 935 940

aca ccg tgt tcc ggc tcg tgg cta agg gat gtt tgg gac tgg ata tgc 2880
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

acg gtg ttg act gac ttc aag acc tgg ctc cag tcc aag ctc ctg ccg 2928
Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro
965 970 975

cag cta ccg gga gtc cct ttt ttc tcg tgc caa cgc ggg tac aag gga 2976
Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly
980 985 990

gtc tgg cgg gga gac ggc atc atg caa acc acc tgc cca tgt gga gca 3024
Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala
995 1000 1005

cag atc acc gga cat gtc aaa aac ggt tcc atg agg atc gtc ggg cct 3072
Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro
1010 1015 1020

aag acc tgc agc aac acg tgg cat gga aca ttc ccc atc aac gca tac 3120
Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

acc acg ggc ccc tgc aca ccc tct cca gcg cca aac tat tct agg gcg 3168
Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala
1045 1050 1055

ctg tgg cgg gtg gcc gct gag gag tac gtg gag gtc acg cgg gtg ggg 3216
Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070

gat ttc cac tac gtg acg ggc atg acc act gac aac gta aag tgc cca 3264
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro
1075 1080 1085

tgc cag gtt ccg gct cct gaa ttc ttc acg gag gtg gac gga gtg cgg 3312
Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg
1090 1095 1100

ttg cac agg tac gct ccg gcg tgc agg cct ctc cta cgg gag gag gtt 3360
Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val
1105 1110 1115 1120

aca ttc cag gtc ggg ctc aac caa tac ctg gtt ggg tca cag cta cca 3408
Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro
1125 1130 1135

tgc gag ccc gaa ccg gat gta gca gtg ctc act tcc atg ctc acc gac 3456
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
1140 1145 1150

ccc tcc cac atc aca gca gaa acg gct aag cgt agg ttg gcc agg ggg 3504
Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
1155 1160 1165

tct ccc ccc tcc ttg gcc agc tct tca gct agc cag ttg tct gcg cct 3552
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
1170 1175 1180

tcc ttg aag gcg aca tgc act acc cac cat gtc tct ccg gac gct gac 3600
Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp
1185 1190 1195 1200

ctc atc gag gcc aac ctc ctg tgg cgg cag gag atg ggc ggg aac atc 3648
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
1205 1210 1215

acc cgc gtg gag tcg gag aac aag gtg gta gtc ctg gac tct ttc gac 3696
Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp
1220 1225 1230

ccg ctt cga gcg gag gag gat gag agg gaa gta tcc gtt ccg gcg gag 3744
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu
1235 1240 1245

atc ctg cgg aaa tcc aag aag ttc ccc gca gcg atg ccc atc tgg gcg 3792
Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala
1250 1255 1260

cgc ccg gat tac aac cct cca ctg tta gag tcc tgg aag gac ccg gac 3840
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp
1265 1270 1275 1280

tac gtc cct ccg gtg gtg cac ggg tgc ccg ttg cca cct atc aag gcc 3888
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala
1285 1290 1295

cct cca ata cca cct cca cgg aga aag agg acg gtt gtc cta aca gag 3936
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu
1300 1305 1310

tcc tcc gtg tct tct gcc tta gcg gag ctc gct act aag acc ttc ggc 3984
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly
1315 1320 1325

agc tcc gaa tca tcg gcc gtc gac agc ggc acg gcg acc gcc ctt cct 4032
Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro
1330 1335 1340

gac cag gcc tcc gac gac ggt gac aaa gga tcc gac gtt gag tcg tac 4080
Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360

tcc tcc atg ccc ccc ctt gag ggg gaa ccg ggg gac ccc gat ctc agt 4128
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1365 1370 1375

gac ggg tct tgg tct acc gtg agc gag gaa gct agt gag gat gtc gtc 4176
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val
1380 1385 1390

tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg atc acg cca tgc 4224
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys
1395 1400 1405

gct gcg gag gaa agc aag ctg ccc atc aac gcg ttg agc aac tct ttg 4272
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu
1410 1415 1420

ctg cgc cac cat aac atg gtt tat gcc aca aca tct cgc agc gca ggc 4320
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly
1425 1430 1435 1440

ctg cgg cag aag aag gtc acc ttt gac aga ctg caa gtc ctg gac gac 4368
Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
1445 1450 1455

cac tac cgg gac gtg ctc aag gag atg aag gcg aag gcg tcc aca gtt 4416
His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1460 1465 1470

aag gct aaa ctc cta tcc gta gag gaa gcc tgc aag ctg acg ccc cca 4464
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
1475 1480 1485

cat tcg gcc aaa tcc aag ttt ggc tat ggg gca aag gac gtc cgg aac 4512
His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500

cta tcc agc aag gcc gtt aac cac atc cac tcc gtg tgg aag gac ttg 4560
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520

ctg gaa gac act gtg aca cca att gac acc acc atc atg gca aaa aat 4608
Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1525 1530 1535

gag gtt ttc tgt gtc caa cca gag aaa gga ggc cgt aag cca gcc cgc 4656
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
1540 1545 1550

ctt atc gta ttc cca gat ctg gga gtc cgt gta tgc gag aag atg gcc 4704
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
1555 1560 1565

ctc tat gat gtg gtc tcc acc ctt cct cag gtc gtg atg ggc tcc tca 4752
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser
1570 1575 1580

tac gga ttc cag tac tct cct ggg cag cga gtc gag ttc ctg gtg aat 4800
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600

acc tgg aaa tca aag aaa aac ccc atg ggc ttt tca tat gac act cgc 4848
Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg
1605 1610 1615

tgt ttc gac tca acg gtc acc gag aac gac atc cgt gtt gag gag tca 4896
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser
1620 1625 1630

att tac caa tgt tgt gac ttg gcc ccc gaa gcc aga cag gcc ata aaa 4944
Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys
1635 1640 1645

tcg ctc aca gag cgg ctt tat atc ggg ggt cct ctg act aat tca aaa 4992
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys
1650 1655 1660

ggg cag aac tgc ggt tat cgc cgg tgc cgc gcg agc ggc gtg ctg acg 5040
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680

act agc tgc ggt aac acc ctc aca tgt tac ttg aag gcc tct gca gcc 5088
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala
1685 1690 1695

tgt cga gct gcg aag ctc cag gac tgc acg atg ctc gtg aac gga gac 5136
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp
1700 1705 1710

gac ctt gtc gtt atc tgt gaa agc gcg gga acc caa gag gac gcg gcg 5184
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala
1715 1720 1725

agc cta cga gtc ttc acg gag gct atg act agg tac tct gcc ccc ccc 5232
Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro
1730 1735 1740

ggg gac ccg ccc caa cca gaa tac gac ttg gag ctg ata aca tca tgt 5280
Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760

tcc tcc aat gtg tcg gtc gcc cac gat gca tca ggc aaa agg gtg tac 5328
Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr
1765 1770 1775

tac ctc acc cgt gat ccc acc acc ccc ctc gca cgg gct gcg tgg gaa 5376
Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu
1780 1785 1790

aca gct aga cac act cca gtt aac tcc tgg cta ggc aac att atc atg 5424
Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met
1795 1800 1805

tat gcg ccc act ttg tgg gca agg atg att ctg atg act cac ttc ttc 5472
Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe
1810 1815 1820

tcc atc ctt cta gca cag gag caa ctt gaa aaa gcc ctg gac tgc cag 5520
Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840

atc tac ggg gcc tgt tac tcc att gag cca ctt gac cta cct cag atc 5568
Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile
1845 1850 1855

att gaa cga ctc cat ggc ctt agc gca ttt tca ctc cat agt tac tct 5616
Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser
1860 1865 1870

cca ggt gag atc aat agg gtg gct tca tgc ctc agg aaa ctt ggg gta 5664
Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val
1875 1880 1885

cca ccc ttg cga gtc tgg aga cat cgg gcc agg agc gtc cgc gct agg 5712
Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1890 1895 1900

cta ctg tcc cag ggg ggg agg gcc gcc act tgt ggc aag tac ctc ttc 5760
Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920

aac tgg gca gtg aag acc aaa ctc aaa ctc act cca atc ccg gct gcg 5808
Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935

tcc cag ctg gac ttg tcc ggc tgg ttc gtt gct ggt tac agc ggg gga 5856
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly
1940 1945 1950

gac ata tat cac agc ctg tct cgt gcc cga ccc cgc tgg ttc atg ctg 5904
Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu
1955 1960 1965

tgc cta ctc cta ctt tct gta ggg gta ggc atc tac ctg ctc ccc aac 5952
Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980



cga 
Arg 
1985 

<210> 6 
<211> 1984
<212> PRT

<213> Artificial Sequence 
<220> 

<223> NS sequence 



<400> 6

Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys
1 5 10 15

Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly Glu
20 25 30

Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys Val
35 40 45

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu
50 55 60

Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln
65 70 75 80

Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro
85 90 95

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp
100 105 110

Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser
115 120 125

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu
130 135 140

Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr
145 150 155 160

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu
165 170 175

Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala
180 185 190

Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly Ser
195 200 205

Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr Lys
210 215 220

Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala
225 230 235 240

Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly Val
245 250 255

Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys
260 265 270

Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile Ile
275 280 285

Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile Gly
290 295 300

Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu
305 310 315 320

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn Ile
325 330 335

Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly Lys
340 345 350

Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe Cys
355 360 365

His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu
370 375 380

Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile
385 390 395 400

Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr
405 410 415

Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys Val
420 425 430

Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu Thr
435 440 445

Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly Arg
450 455 460

Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly Glu
465 470 475 480

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp
485 490 495

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg
500 505 510

Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His
515 520 525

Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp Ala
530 535 540

His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr Leu
545 550 555 560

Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro
565 570 575

Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu
580 585 590

His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn Glu
595 600 605

Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met Ser
610 615 620

Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val
625 630 635 640

Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val Ile
645 650 655

Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp Arg
660 665 670

Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His
675 680 685

Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys Gln
690 695 700

Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala Ala
705 710 715 720

Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala
725 730 735

Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly Leu
740 745 750

Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe Thr
755 760 765

Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe Asn
770 775 780

Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala Ala
785 790 795 800

Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser Ile
805 810 815

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
820 825 830

Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro
835 840 845

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
850 855 860

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
865 870 875 880

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
885 890 895

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
900 905 910

Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile Thr
915 920 925

Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser Thr
930 935 940

Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys Thr
945 950 955 960

Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro Gln
965 970 975

Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly Val
980 985 990

Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala Gln
995 1000 1005

Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro Lys
1010 1015 1020

Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr Thr
1025 1030 1035 1040

Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu
1045 1050 1055

Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly Asp
1060 1065 1070

Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro Cys
1075 1080 1085

Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg Leu
1090 1095 1100

His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val Thr
1105 1110 1115 1120

Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys
1125 1130 1135

Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro
1140 1145 1150

Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser
1155 1160 1165

Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser
1170 1175 1180

Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp Leu
1185 1190 1195 1200

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr
1205 1210 1215

Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp Pro
1220 1225 1230

Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile
1235 1240 1245

Leu 


Arg 


Lys 


Ser 


Lys 


Lys 


Phe 


Pro 


Ala 


Ala 


Met 


Pro 


lie 


Trp 


Ala 


Arg 



1250 1255 1260 
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Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr 
1265 1270 1275 1280 

Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala Pro

1285 1290 1295

Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser

1300 1305 1310 

Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser 

1315 1320 1325 

Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro Asp 

1330 1335 1340 

Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr Ser
1345 1350 1355 1360 

Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp 

1365 1370 1375 

Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val Cys 

1380 1385 1390 

Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys Ala

1395 1400 1405

Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu

1410 1415 1420 

Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly Leu 
1425 1430 1435 1440 

Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp His

1445 1450 1455 

Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys 

1460 1465 1470 

Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His 

1475 1480 1485 

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu 

1490 1495 1500 

Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu Leu
1505 1510 1515 1520

Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn Glu

1525 1530 1535

Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu

1540 1545 1550

Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu

1555 1560 1565

Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser Tyr

1570 1575 1580

Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn Thr
1585 1590 1595 1600

Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys

1605 1610 1615

Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser Ile

1620 1625 1630

Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys Ser

1635 1640 1645

Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys Gly

1650 1655 1660

Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr
1665 1670 1675 1680

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys
1685 1690 1695

Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp Asp

1700 1705 1710

Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala Ser

1715 1720 1725 

Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly 

1730 1735 1740 

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1745 1750 1755 1760 

Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr Tyr 

1765 1770 1775 

Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr 

1780 1785 1790 

Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met Tyr

1795 1800 1805

Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe Ser

1810 1815 1820

Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln Ile
1825 1830 1835 1840

Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile Ile

1845 1850 1855 

Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro 

1860 1865 1870 

Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val Pro

1875 1880 1885 

Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu 

1890 1895 1900 

Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn
1905 1910 1915 1920

Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala Ser

1925 1930 1935

Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp

1940 1945 1950

Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu Cys

1955 1960 1965 

Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn Arg
1970 1975 1980

<210> 7
<211> 4909
<212> DNA
<213> Artificial Sequence
<220>

<223> pV1J nucleic acid
<400> 7 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 
ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300 
tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360 
ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420 
cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480 
catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540
tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600

tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660 

ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 

catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 

cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 

ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900 

agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960

tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020 

tccccgtgcc aagagtgacg taagtaccgc ctatagactc tataggcaca cccctttggc 1080 

tcttatgcat gctatactgt ttttggcttg gggcctatac acccccgctt ccttatgcta 1140 

taggtgatgg tatagcttag cctataggtg tgggttattg accattattg accactcccc 1200 

tattggtgac gatactttcc attactaatc cataacatgg ctctttgcca caactatctc 1260 

tattggctat atgccaatac tctgtccttc agagactgac acggactctg tatttttaca 1320 

ggatggggtc ccatttatta tttacaaatt cacatataca acaacgccgt cccccgtgcc 1380 

cgcagttttt attaaacata gcgtgggatc tccacgcgaa tctcgggtac gtgttccgga 1440 

catgggctct tctccggtag cggcggagct tccacatccg agccctggtc ccatgcctcc 1500 

agcggctcat ggtcgctcgg cagctccttg ctcctaacag tggaggccag acttaggcac 1560 

agcacaatgc ccaccaccac cagtgtgccg cacaaggccg tggcggtagg gtatgtgtct 1620 

gaaaatgagc gtggagattg ggctcgcacg gctgacgcag atggaagact taaggcagcg 1680 

gcagaagaag atgcaggcag ctgagttgtt gtattctgat aagagtcaga ggtaactccc 1740 

gttgcggtgc tgttaacggt ggagggcagt gtagtctgag cagtactcgt tgctgccgcg 1800 

cgcgccacca gacataatag ctgacagact aacagactgt tcctttccat gggtcttttc 1860 

tgcagtcacc gtccttagat ctaggtacca gatatcagaa ttcagtcgac agcggccgcg 1920 

atctgctgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt 1980 

gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca 2040 

ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga 2100 

ggattgggaa gacaatagca ggcatgctgg ggatgcggtg ggctctatgg ccgctgcggc 2160 

caggtgctga agaattgacc cggttcctcc tgggccagaa agaagcaggc acatcccctt 2220

ctctgtgaca caccctgtcc acgcccctgg ttcttagttc cagccccact cataggacac 2280 

tcatagctca ggagggctcc gccttcaatc ccacccgcta aagtacttgg agcggtctct 2340 

ccctccctca tcagcccacc aaaccaaacc tagcctccaa gagtgggaag aaattaaagc 2400 

aagataggct attaagtgca gagggagaga aaatgcctcc aacatgtgag gaagtaatga 2460 

gagaaatcat agaatttctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 2520 

gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 2580 

ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 2640

ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 2700 

acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 2760 

tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 2820 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 2880 

ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 2940 

ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 3000 

actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 3060 

attcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 3120 

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 3180 

caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 3240 

atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 3300 

acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 3360 

ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 3420 

ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 3480

tgcctgactc gggggggggg ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc 3540 

ataccaggcc tgaatcgccc catcatccag ccagaaagtg agggagccac ggttgatgag 3600 

agctttgttg taggtggacc agttggtgat tttgaacttt tgctttgcca cggaacggtc 3660 

tgcgttgtcg ggaagatgcg tgatctgatc cttcaactca gcaaaagttc gatttattca 3720 

acaaagccgc cgtcccgtca agtcagcgta atgctctgcc agtgttacaa ccaattaacc 3780 

aattctgatt agaaaaactc atcgagcatc aaatgaaact gcaatttatt catatcagga 3840

ttatcaatac catatttttg aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg 3900

cagttccata ggatggcaag atcctggtat cggtctgcga ttccgactcg tccaacatca 3960

atacaaccta ttaatttccc ctcgtcaaaa ataaggttat caagtgagaa atcaccatga 4020

gtgacgactg aatccggtga gaatggcaaa agcttatgca tttctttcca gacttgttca 4080

acaggccagc cattacgctc gtcatcaaaa tcactcgcat caaccaaacc gttattcatt 4140

cgtgattgcg cctgagcgag acgaaatacg cgatcgctgt taaaaggaca attacaaaca 4200

ggaatcgaat gcaaccggcg caggaacact gccagcgcat caacaatatt ttcacctgaa 4260

tcaggatatt cttctaatac ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac 4320

catgcatcat caggagtacg gataaaatgc ttgatggtcg gaagaggcat aaattccgtc 4380

agccagttta gtctgaccat ctcatctgta acatcattgg caacgctacc tttgccatgt 4440

ttcagaaaca actctggcgc atcgggcttc ccatacaatc gatagattgt cgcacctgat 4500

tgcccgacat tatcgcgagc ccatttatac ccatataaat cagcatccat gttggaattt 4560

aatcgcggcc tcgagcaaga cgtttcccgt tgaatatggc tcataacacc ccttgtatta 4620

ctgtttatgt aagcagacag ttttattgtt catgatgata tatttttatc ttgtgcaatg 4680

taacatcaga gattttgaga cacaacgtgg ctttcccccc ccccccatta ttgaagcatt 4740

tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa 4800

ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga aaccattatt 4860

atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtc 4909



<210> 8 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 6 



<400> 8 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840

cctttcccgg gagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960

cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740

ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800

tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520 

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240 

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200 

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040

ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920 

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 

aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520 

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760 

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120 

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180 

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420 

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860 

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920 

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460 

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640

gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300 

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800 

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca
gggcgggggt 
cgcggcagcc 
cacctttgcc 
cgcccccgct 
gacagaggac 
gtaccgcagc 
gaccctgctt 
agacatgatg 
ggtgggcgcc 
ctcccaactc 
ccagattttg 
tctcacagat 
cattactgac 
gccgcgcgtc 
caataacaca 
ctccgaccaa 
acgcggccgc 
gcgcaactac 
ggtgcgcgga 
ccaccgccgc 
acgtcgcacc 
cactgtgccc 
tatgactcag 
cgtgcccgtg 
gtactgttgt 
caaagaagag 
gcaggattac 
tgaacttgac 
gaaaggtcga 
tgagcgctcc 
gcttgagcag 
gctggcgttg 
gcaggtgctg 
tgacttggca 
ggaaaaaatg 
ggtggcgccg 
cagtattgcc 
ggcggatgcc 
aacggacccg 
cggcgccgcc 
cggctatcgt 
<jactggaacc 
cagggtggct 
catcgtttaa 
tttcccggtg 
cctgacgggc 
gcgcggcggt 
cggaattgca 
gaaaaatcaa 
gaatggaaga 
gaaactggca 
tgtggagcgg 
acagcagcac 
tggtagatgg 



ctgttggatg
ggcgcaggcg 
gcggcaatgc 
acacgggctg 
gcgcaacccg 
agcaagaaac 
tggtaccttg 
tgcactcctg 
caagaccccg 
gagctgttgc 
atccgccagt 
gcgcgcccgc 
cacgggacgc 
gccagacgcc 
ctatcgagcc 
ggctggggcc 
cacccagtgc 
actgggcgca 
acgcccacgc 
gcccggcgct 
cgacccggca 
ggccgacggg 
cccaggtcca 
ggtcgcaggg 
cgcacccgcc 
atgtatccag 
atgctccagg 
aagccccgaa 
gacgaggtgg 
cgcgtaaaac 
acccgcacct 
gccaacgagc 
ccgctggacg 
cccgcgcttg 
cccaccgtgc 
accgtggaac 
ggactgggcg 
accgccacag 
gcggtgcagg 
tggatgtttc 
agcgcgctac 
ggctacacct 
cgccgccgcc 
cgcgaaggag 
aagccggtct 
ccgggattcc 
ggcatgcgtc 
atcctgcccc 
tccgtggcct 
aataaaaagt 
catcaacttt 
agatatcggc 
cattaaaaat 
aggccagatg 
cctggcctct 



tggacgccta 
gcagcaacag 
agccggtgga 
aggagaagcg 
aggtcgagaa 
gcagttacaa 
catacaacta 
acgtaacctg 
tgaccttccg 
ccgtgcactc 
ttacctctct 
cagcccccac 
taccgctgcg 
gcacctgccc 
gcactttttg 
tgcgcttccc 
gcgtgcgcgg 
ccaccgtcga 
cgccaccagt 
atgctaaaat 
ctgccgccca 
cggccatgcg 
ggcgacgagc 
gcaacgtgta 
ccccgcgcaa 
cggcggcggc 
tcatcgcgcc 
agctaaagcg 
aactgctgca 
gtgttttgcg 
acaagcgcgt 
gcctcgggga 
agggcaaccc 
caccgtccga 
agctgatggt 
ctgggctgga 
tgcagaccgt 
agggcatgga 
cggtcgctgc 
gcgtttcagc 
tgcccgaata 
accgccccag 
gtcgccgtcg 
gcaggaccct 
ttgtggttct 
gaggaagaat 
gtgcgcacca 
tccttattcc 
tgcaggcgca 
ctggactctc 
gcgtctctgg 
accagcaata 
ttcggttcca 
ctgagggata 
ggcattagcg 



ccaggcgagc 
cagtggcagc 
ggacatgaac 
cgctgaggcc 
gcctcagaag 
cctaataagc 
cggcgaccct 
cggctcggag 
ctccacgcgc 
caagagcttc 
gacccacgtg 
catcaccacc 
caacagcatc 
ctacgtttac 
agcaagcatg 
aagcaagatg 
gcactaccgc 
tgacgccatc 
gtccacagtg 
gaagagacgg 
acgcgcggcg 
ggccgctcga 
ggccgccgca 
ttgggtgcgc 
ctagattgca 
gcgcaacgaa 
ggagatctat 
ggtcaaaaag 
cgctaccgcg 
acccggcacc 
gtatgatgag 
gtttgcctac 
aacacctagc 
agaaaagcgc 
acccaagcgc 
gcccgaggtc 
ggacgttcag 
gacacaaacg 
ggccgcgtcc 
cccccggcgc 
tgccctacat 
aagacgagca 
ccagcccgtg 
ggtgctgcca 
tgcagatatg 
gcaccgtagg 
ccggcggcgg 
actgatcgcc 
gagacactga 
acgctcgctt 
ccccgcgaca 
tgagcggtgg 
ccgttaagaa 
agttgaaaga 
gggtggtgga 



ttgaaagatg 
ggcgcggaag 
gatcatgcca 
gaagcagcgg 
aaaccggtga 
aatgacagca 
cagaccggaa 
caggtctact 
cagatcagca 
tacaacgacc 
ttcaatcgct 
gtcagtgaaa 
ggaggagtcc 
aaggccctgg 
tccatcctta 
tttggcgggg 
gcgccctggg 
gacgcggtgg 
gacgcggcca 
cggaggcgcg 
gcggccctgc 
aggctggccg 
gcagccgcgg 
gactcggtta 
agaaaaaact 
gctatgtcca 
ggccccccga 
aaaaagaaag 
cccaggcgac 
accgtagtct 
gtgtacggcg 
ggaaagcggc 
ctaaagcccg 
ggcctaaagc 
cagcgactgg 
cgcgtgcggc 
atacccacta 
tccccggttg 
aagacctcta 
ccgcgcggtt 
ccttccattg 
actacccgac 
ctggccccga 
acagcgcgct 
gccctcacct 
aggggcatgg 
cgcgcgtcgc 
gcggcgattg 
ttaaaaacaa 
ggtcctgtaa 
cggctcgcgc 
cgccttcagc 
ctatggcagc 
gcaaaatttc 
cctggccaac 



acaccgaaca 
agaactccaa 
ttcgcggcga 
ccgaagctgc 
tcaaacccct 
ccttcaccca 
tccgctcatg 
ggtcgttgcc 
actttccggt 
aggccgtcta 
ttcccgagaa 
acgttcctgc 
agcgagtgac 
gcatagtctc 
tatcgcccag 
ccaagaagcg 
gcgcgcacaa 
tggaggaggc 
ttcagaccgt 
tagcacgtcg 
ttaaccgcgc 
cgggtattgt 
ccattagtgc 
gcggcctgcg 
acttagactc 
agcgcaaaat 
agaaggaaga 
atgatgatga 
gggtacagtg 
ttacgcccgg 
acgaggacct 
ataaggacat 
taacactgca 
gcgagtctgg 
aagatgtctt 
caatcaagca 
ccagtagcac 
cctcagcggt 
cggaggtgca 
cgaggaagta 
cgcctacccc 
gccgaaccac 
tttccgtgcg 
accaccccag 
gccgcctccg 
ccggccacgg 
accgtcgcat 
gcgccgtgcc 
gttgcatgtg 
ctattttgta 
ccgttcatgg 
tggggctcgc 
aaggcctgga 
caacaaaagg 
caggcagtgc 



15060 
15120 
15180 
15240 
15300 
15360 
15420 
15480 
15540 
15600 
15660 
15720 
15780 
15840 
15900 
15960 
16020 
16080 
16140 
16200 
16260 
16320 
16380 
16440 
16500 
16560 
16620 
16680 
16740 
16800 
16860 
16920 
16980 
17040 
17100 
17160 
17220 
17280 
17340 
17400 
17460 
17520 
17580 
17640 
17700 
17760 
17820 
17880 
17940 
18000 
18060 
18120 
18180 
18240 
18300 
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aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360 

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420 

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480 

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540 

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600 

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660 

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720 

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780 

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840 

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900 

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960 

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020 

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080 

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140 

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200 

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260 

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320 

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440 

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500 

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560 

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620 

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680 

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740 

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800 

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860 

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980 

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040 

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100 

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160 

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220 

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280 

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340 

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400 

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460 

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520 

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580 

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640 

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700 

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760 

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820 

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880 

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940 

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000 

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060 

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120 

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180 

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240 

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300 

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360 

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420 

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480 

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540 

gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600 

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660 

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720 

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020 

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980 

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040 

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400 

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940 

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060 

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120 

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360 

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900 
atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080 

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140 

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200 

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260 

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320 

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380 

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440 

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500 

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560 

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620 

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680 

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740 

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860 

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920 

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980 

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040 

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100 

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160 

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220 

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340 

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400 

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520 

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700 

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000 

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060 

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120 

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180 

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240 

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300 

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360 

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420 

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480 

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540 

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600 

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660 

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720 

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780 

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840 

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900 

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960 

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020 

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080 

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140 

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420

cgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500

gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680 

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800 

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100 

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220 

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280 

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagc tcatgtcgct 33120

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180 

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 

gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935



<210> 9 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 5 



<400> 9 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960

agaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740

ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800

tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860 

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980 

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520 

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttctactgct 2880

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240 

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560 

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800 

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 
ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160 

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760 

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680 

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920 

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 
aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460 

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520 

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760 

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120 

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180 

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420 

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620 

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860 

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980 

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040 

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220 

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460 

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640 
gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760 

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300 

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800 

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctc gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 
caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060 

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360 

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420 

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600 

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160 

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400 

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

tttcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 



tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420 

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480 

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540 

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600 

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660 

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720 

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780 

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840 

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900 

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960 

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020 

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080 

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140 

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200 

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260 

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320 

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380 

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440 

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500 

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560 

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620 

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680 

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740 

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800 

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860 

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980 

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040 

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100 

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220 

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280 

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340 

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460 

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580 

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640 

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700 

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760 

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880 

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940 

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000 

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060 

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120 

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180 

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240 

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300 

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360 

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420 

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480 

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540 



gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600 

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660 

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720 

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980 

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400 

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520 

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060 

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360 

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900

atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260 

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320 

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380 

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440 

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560 

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620 

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680 

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800 

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860 

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920 

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980 

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040 

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100 

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160 

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220 

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280 

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340 

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400 

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460 

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520 

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580 

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640 

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700 

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760 

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820 

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880 

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940 

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000 

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060 

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180 

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240 

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300 

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360 

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420 

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480 

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540 

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600 

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660 

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720 

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780 

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840 
aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900 
ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960 
aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020 
actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080 
gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140 
gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200 
cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260 
gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320 
aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380 
gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440 
ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500
gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560
actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620
gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680
acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740
tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800
aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860
tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920
aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980
aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040
acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100
acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160
gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220
aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280
aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340
gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400
agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460
gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520
gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580
agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640
attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700
ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760
actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820
ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880
tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940
acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000
catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060
caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120
gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180
agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240
ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300
ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360
ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420
aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480
agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540
cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600
tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660
ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720
ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780
cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840
aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900
cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960
agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020
caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080
tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140
tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200
tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260
cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320
cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380
aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440
aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500
ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560
ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620
agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680
ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740
gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860 

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920 

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980 

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040 

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100 

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160 

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220 

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280 

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340 

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400 

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460 

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520 

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640 

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700 

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760 

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820 

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880 

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935

<210> 10 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NSsuboptmut 
<400> 10 

gccaccatgg cccccatcac cgcctacagc cagcagacca ggggcctgct gggctgcatc 60 

atcaccagcc tgaccggacg cgacaagaac caggtggagg gagaggtgca ggtggtgagc 120 

accgctaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggagccggaa gcaagaccct ggccggaccc aagggcccta tcacccagat gtacaccaat 240 

gtggatcagg atctggtggg ctggcaggcc cctcccggag ccaggagcct gacaccctgt 300 

acctgtggaa gcagcgacct gtacctggtg acacgccacg ccgatgtgat ccccgtgagg 360 

cgcaggggcg attctcgcgg aagcctgctg agccctaggc ccgtgagcta cctgaagggc 420 

agcagcggag gacccctgct gtgtccttct ggccatgccg tgggcatttt tcgcgctgcc 480 

gtgtgtacca ggggcgtggc caaagccgtg gattttgtgc ccgtggaaag catggagacc 540 

accatgcgca gccctgtgtt caccgacaac agctctcccc ctgccgtgcc ccaatcattc 600 

caggtggctc acctgcacgc ccctaccgga tctggcaaga gcaccaaggt gcccgctgcc 660 

tacgccgctc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc taccctgggc 720 

ttcggcgctt acatgagcaa ggcccatggc atcgacccca acatccgcac aggcgtgcgc 780

accatcacca ccggagctcc cgtgacctac agcacctacg gcaagttcct ggccgatgga 840 

ggctgcagcg gaggagccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcattgg caccgtgctg gatcaggccg aaacagctgg agccaggctg 960 

gtggtgctgg ccacagctac ccctcctggc agcgtgaccg tgccccatcc caatatcgag 1020 

gaggtggccc tgagcaacac aggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gaggcaggca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gctgccaagc tgagcggact gggcatcaac gccgtggcct actacagggg cctggacgtg 1200

tcagtgatcc ccaccatcgg cgatgtggtg gtggtggcca ccgacgccct gatgacaggc 1260 

tacaccggag acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgaa accaccaccg tgcctcagga tgctgtgagc 1380 

aggagccaga ggcgcggacg caccggaagg ggcaggcgcg gaatttatcg ctttgtgacc 1440 

cctggcgaaa ggccctctgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgct 1500 
ggctgcgctt ggtacgagct gacacccgct gaaaccagcg tgcgcctgcg cgcttatctg 1560

aatacccctg gcctgcccgt gtgtcaggac cacctggagt tctgggagag cgtgttcaca 1620

ggactgaccc acatcgacgc ccatttcctg agccagacca agcaggctgg cgacaacttc 1680

ccctatctgg tggcctatca ggccaccgtg tgtgctaggg cccaagctcc acctccttca 1740

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccctacccct 1800

ctgctgtacc gcctgggagc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860

tacatcatgg cctgcatgag cgctgatctg gaagtggtga ccagcacctg ggtgctggtg 1920

ggaggcgtgc tggccgctct ggctgcctac tgcctgacca ccggaagcgt ggtgatcgtg 1980

ggacgcatca tcctgagcgg aaggcccgct atcgtgcccg atcgcgagtt cctgtaccag 2040

gagttcgacg agatggagga gtgtgccagc cacctgccct acatcgagca gggcatgcag 2100

ctggccgaac agttcaagca gaaggccctg ggcctgctgc agacagccac caaacaggcc 2160

gaagctgccg ctcccgtggt ggaaagcaag tggagggccc tggagacctt ctgggctaag 2220

cacatgtgga acttcatctc tggcatccag tacctggccg gactgagcac cctgcctggc 2280

aaccccgcta tcgccagcct gatggccttc accgctagca tcacctctcc cctgaccacc 2340

cagagcaccc tgctgttcaa cattctgggc ggatgggtgg ccgctcagct ggcccctcct 2400

tcagctgctt ctgcctttgt gggcgctggc attgccggag ccgctgtggg cagcattggc 2460

ctgggcaaag tgctggtgga tattctggct ggctatggcg ctggcgtggc cggagccctg 2520

gtggccttca aggtgatgag cggagagatg cccagcaccg aggacctggt gaacctgctg 2580

cctgccattc tgagccctgg agccctggtg gtgggcgtgg tgtgtgctgc cattctgagg 2640

cgccatgtgg gacccggaga gggcgctgtg cagtggatga accgcctgat cgccttcgcc 2700

tctcgcggaa accacgtgag ccctacccac tacgtgcctg agagcgacgc cgctgccagg 2760

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820

atcaacgagg actgcagcac accctgcagc ggaagctggc tgagggacgt gtgggactgg 2880

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccaactg 2940

cctggcgtgc ccttcttctc atgccagcgc ggatacaagg gcgtgtggag gggcgatggc 3000

atcatgcaga ccacctgtcc ctgcggagcc cagatcacag gccacgtgaa gaacggcagc 3060

atgcgcatcg tgggccctaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120

gcctacacca ccggaccctg cacacccagc cctgctccca actacagcag ggccctgtgg 3180

agggtggctg ccgaggagta cgtggaggtg accagggtgg gagacttcca ctacgtgacc 3240

ggaatgacca ccgacaacgt gaagtgtccc tgtcaggtgc ccgctcccga attttttacc 3300

gaagtggatg gcgtgcgcct gcatcgctat gcccctgcct gtaggcccct gctgcgcgaa 3360

gaagtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420

cctgagcccg atgtggccgt gctgaccagc atgctgaccg accccagcca catcacagcc 3480

gaaaccgcta aaaggcgcct ggccaggggc tctcctccaa gcctggcctc aagcagcgct 3540

agccagctgt ctgctcccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620

ttctgcgtgc agcccgagaa gggcggccgc aagcccgctc gcctgatcgt gttccccgat 4680

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgcctcag 4740

gtggtgatgg gctcaagcta cggcttccag tacagccctg gccagcgcgt ggagttcctg 4800

gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac acgctgcttc 4860

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920

ctggcccctg aggccaggca ggccatcaag agcctgaccg agcgcctgta catcggaggc 4980 

cctctgacca acagcaaggg acagaactgc ggatacaggc gctgtagggc ctctggcgtg 5040 

ctgaccacca gctgtggcaa caccctgacc tgctacctga aggccagcgc tgcctgtcgc 5100 

gctgccaagc tgcaggactg caccatgctg gtgaacgccg ctggcctggt ggtgatttgt 5160 

gaaagcgctg gcacccagga agatgctgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

aggtactctg cccctcccgg agacccccct cagcccgaat acgacctgga gctgatcacc 5280 

agctgctcaa gcaacgtgag cgtggctcac gacgccagcg gaaagcgcgt gtactacctg 5340 

acacgcgatc ccaccacccc tctggctcgc gctgcctggg aaaccgctcg ccatacaccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccta ccctgtgggc tcgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gctcaggagc agctggagaa ggccctggac 5520 

tgccagattt acggcgcttg ctacagcatc gagcccctgg acctgcccca aatcatcgag 5580 

cgcctgcacg gcctgtctgc cttcagcctg cacagctaca gccctggcga aattaatcgc 5640 

gtggccagct gtctgcgcaa actgggcgtg cctcctctgc gcgtgtggag gcatagggct 5700 

aggagcgtga gggctaggct gctgagccag ggaggcaggg ccgctacctg tggaaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ctatccctgc cgctagccag 5820 

ctggacctga gcggatggtt cgtggctggc tacagcggag gcgacatcta ccacagcctg 5880 

tctcgcgctc gccctcgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 11 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric NSsuboptmut 



<400> 11 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 

gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 
gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 12 

<211> 10 

<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Ribosome binding site 

<400> 12 

gccaccaugg 10

<210> 13 
<211> 49 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic polyadenylation signal 
<400> 13 

aauaaaagau cuuuauuuuc auuagaucug uguguugguu uuuugugug 49 


<210> 14 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NS
<400> 14 

tctagagcgt ttaaaccctt aattaagg 28 
<210> 15 
<211> 15
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NSOPTmut
<400> 15 

tttaaatgtt taaac 15 

<210> 16 
<211> 24 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 16 

tcgaatcgat acgcgaacct acgc 24 

<210> 17 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 17 

tcgacgtgtc gacttcgaag cgcacaccaa aaacgtc 37 